
Environmental forecasting suites generate forecast products from a potentially large group of interdependent scientific models and associated data processing tasks. They are constrained by availability of external driving data: typically one or more tasks will wait on real time observations and/or model data from an external system, and these will drive other downstream tasks, and so on. The dependency diagram for a single forecast cycle in such a system is a Directed Acyclic Graph as shown in Figure 1 (in our terminology, a forecast cycle comprises all tasks with a common cycle time, which is the nominal analysis time or start time of the forecast models in the group). In real time operation, processing consists of a series of distinct forecast cycles that are each initiated, after a gap, by arrival of the new cycle’s external driving data.
From a job scheduling perspective task execution order in such a system must be carefully controlled in order to avoid dependency violations. Ideally, each task should be queued for execution at the instant its last prerequisite is satisfied; this is the best that can be done even if queued tasks are not able to execute immediately because of resource contention.
Cylc was developed for the EcoConnect Forecasting System at NIWA (National Institute of Water and Atmospheric Research, New Zealand). EcoConnect takes real time atmospheric and stream flow observations, and operational global weather forecasts from the Met Office (UK), and uses these to drive global sea state and regional data assimilating weather models, which in turn drive regional sea state, storm surge, and catchment river models, plus tide prediction, and a large number of associated data collection, quality control, preprocessing, post-processing, product generation, and archiving tasks. The global sea state forecast runs once daily. The regional weather forecast runs four times daily but it supplies surface winds and pressure to several downstream models that run only twice daily, and precipitation accumulations to catchment river models that run on an hourly cycle assimilating real time stream flow observations and using the most recently available regional weather forecast. EcoConnect runs on heterogeneous distributed hardware, including a massively parallel supercomputer and several Linux servers.
Most dependence between tasks applies within a single forecast cycle. Figure 1 shows the dependency diagram for a single forecast cycle of a simple example suite of three forecast models (a, b, and c) and three post processing or product generation tasks (d, e and f). A scheduler capable of handling this must manage, within a single forecast cycle, multiple parallel streams of execution that branch when one task generates output for several downstream tasks, and merge when one task takes input from several upstream tasks.
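The branching and merging described above can be sketched in Python. The dependency edges below are an assumption made purely for illustration (Figure 1 is not reproduced here), and ready_waves is a hypothetical helper, not part of cylc. The sketch also assumes every task takes the same time to run, so tasks are released in "waves".

```python
# Hypothetical single-cycle graph: each task maps to the upstream tasks it waits on.
# Task a branches to b and c; task e merges input from b and c.
PREREQS = {
    "a": set(),
    "b": {"a"},
    "c": {"a"},
    "d": {"b"},
    "e": {"b", "c"},
    "f": {"c"},
}


def ready_waves(prereqs):
    """Group tasks into waves; a task joins the first wave after all its prerequisites finish."""
    remaining = {task: set(deps) for task, deps in prereqs.items()}
    finished = set()
    waves = []
    while remaining:
        # A task is ready once every one of its prerequisites has finished.
        wave = sorted(task for task, deps in remaining.items() if deps <= finished)
        if not wave:
            raise ValueError("dependency cycle detected")
        waves.append(wave)
        finished.update(wave)
        for task in wave:
            del remaining[task]
    return waves


waves = ready_waves(PREREQS)
print(waves)
```

With real, unequal execution times the waves would blur into the kind of staggered schedule shown in the job schedule figures, but the readiness test (all prerequisites satisfied) stays the same.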


Figure 2 shows the optimal job schedule for two consecutive cycles of the example suite in real time operation, given execution times represented by the horizontal extent of the task bars. There is a time gap between cycles as the suite waits on new external driving data. Each task in the example suite happens to trigger off upstream tasks finishing, rather than off any intermediate output or event; this is merely a simplification that makes for clearer diagrams.


Now the question arises, what happens if the external driving data for upcoming cycles is available in advance, as it would be after a significant delay in operations, or when running a historical case study? While the forecast model a appears to depend only on the external data x at this stage of the discussion, in fact it would typically also depend on its own previous instance for the model background state used in initializing the new forecast. Thus, as alluded to in Figure 3, task a could in principle start as soon as its predecessor has finished. Figure 4 shows, however, that starting a whole new cycle at this point is dangerous - it results in dependency violations in half of the tasks in the example suite. In fact the situation could be even worse than this - imagine that task b in the first cycle is delayed for some reason after the second cycle has been launched. Clearly we must consider handling inter-cycle dependence explicitly or else agree not to start the next cycle early, as is illustrated in Figure 5.
Forecast models typically depend on their own most recent previous forecast for background state or restart files of some kind (this is called warm cycling) but there can also be inter-cycle dependence between different tasks. In an atmospheric forecast analysis suite, for instance, the weather model may generate background states for observation processing and data-assimilation tasks in the next cycle as well as for the next forecast model run. In real time operation inter-cycle dependence can be ignored because it is automatically satisfied when one cycle finishes before the next begins. If it is not ignored it drastically complicates the dependency graph by blurring the clean boundary between cycles. Figure 6 illustrates the problem for our simple example suite assuming minimal inter-cycle dependence: the warm cycled models (a, b, and c) each depend on their own previous instances.
For this reason, and because we tend to see forecasting suites in terms of their real time characteristics, other metaschedulers have ignored inter-cycle dependence and are thus restricted to running entire cycles in sequence at all times. This does not affect normal real time operation but it can be a serious impediment when advance availability of external driving data makes it possible, in principle, to run some tasks from upcoming cycles before the current cycle is finished - as was suggested at the end of the previous section. This can occur, for instance, after operational delays (late arrival of external data, system maintenance, etc.) and to an even greater extent in historical case studies and parallel test suites started behind a real time operation. It can be a serious problem for suites that have little downtime between forecast cycles and therefore take many cycles to catch up after a delay. Without taking account of inter-cycle dependence, the best that can be done, in general, is to reduce the gap between cycles to zero as shown in Figure 5. A limited crude overlap of the single cycle job schedule may be possible for specific task sets but the allowable overlap may change if new tasks are added, and it is still dangerous: it amounts to running different parts of a dependent system as if they were not dependent, and as such it cannot be guaranteed that some unforeseen delay in one cycle after the next cycle has begun (e.g. due to resource contention or task failures) won’t result in dependency violations.


Figure 7 shows, in contrast to Figure 4, the optimal two cycle job schedule obtained by respecting all inter-cycle dependence. This assumes no delays due to resource contention or otherwise - i.e. every task runs as soon as it is ready to run. The scheduler running this suite must be able to adapt dynamically to external conditions that impact on multi-cycle scheduling in the presence of inter-cycle dependence or else, again, risk bringing the system down with dependency violations.


To further illustrate the potential benefits of proper inter-cycle dependency handling, Figure 8 shows an operational delay of almost one whole cycle in a suite with little downtime between cycles. Above the time axis is the optimal schedule that is possible in principle when inter-cycle dependence is taken into account, and below it is the only safe schedule possible in general when it is ignored. In the former case, even the cycle immediately after the delay is hardly affected, and subsequent cycles are all on time, whilst in the latter case it takes five full cycles to catch up to normal real time operation.
Similarly, Figure 9 shows example suite job schedules for an historical case study, or when catching up after a very long delay; i.e. when the external driving data are available many cycles in advance. Task a, which as the most upstream forecast model is likely to be a resource intensive atmosphere or ocean model, has no upstream dependence on co-temporal tasks and can therefore run continuously, regardless of how much downstream processing is yet to be completed in its own, or any previous, forecast cycle (actually, task a does depend on co-temporal task x which waits on the external driving data, but that returns immediately when the data is available in advance, so the result stands). The other forecast models can also cycle continuously or with a short gap between, and some post processing tasks, which have no previous-instance dependence, can run continuously or even overlap (e.g. e in this case). Thus, even for this very simple example suite, tasks from three or four different cycles can in principle run simultaneously at any given time. In fact, if our tasks are able to trigger off internal outputs of upstream tasks, rather than waiting on full completion, successive instances of the forecast models could overlap as well (because model restart outputs are generally completed early in the forecast) for an even more efficient job schedule.

Cylc manages a pool of proxy objects that represent the real tasks in a suite. Task proxies know how to run the real tasks that they represent, and they receive progress messages from the tasks as they run (usually reports of completed outputs). There is no global cycling mechanism to advance the suite; instead individual task proxies have their own private cycle time and spawn their own successors when the time is right. Task proxies are self-contained - they know their own prerequisites and outputs but are not aware of the wider suite. Inter-cycle dependence is not treated as special, and the task pool can be populated with tasks with many different cycle times. The task pool is illustrated in Figure 10. Whenever any task changes state due to completion of an output, every task checks to see if its own prerequisites have been satisfied. In effect, cylc gets a pool of tasks to self-organize by negotiating their own dependencies so that optimal scheduling, as described in the previous section, emerges naturally at run time.
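A minimal Python sketch of this self-organizing pool follows. TaskProxy and run_pool are hypothetical names chosen for illustration and do not reflect cylc's internal classes. For brevity the pool is populated up front rather than having each proxy spawn its own successor, cycle times are plain integers, and every task is assumed to succeed in one step.

```python
class TaskProxy:
    """Stand-in for one task instance at one cycle time (illustrative only)."""

    def __init__(self, name, cycle, prereqs=()):
        self.name = name
        self.cycle = cycle
        self.prereqs = set(prereqs)  # output messages this task waits on
        self.state = "waiting"

    @property
    def ident(self):
        return f"{self.name}.{self.cycle}"

    @property
    def output(self):
        return f"{self.ident} succeeded"

    def ready(self, completed_outputs):
        # A proxy only knows its own prerequisites, not the wider suite.
        return self.state == "waiting" and self.prereqs <= completed_outputs


def run_pool(pool):
    """Repeatedly let every proxy check its own prerequisites; return what ran at each step."""
    completed = set()
    history = []
    while any(task.state == "waiting" for task in pool):
        ready = [task for task in pool if task.ready(completed)]
        if not ready:
            raise RuntimeError("pool stalled: unsatisfied prerequisites")
        for task in ready:
            task.state = "succeeded"
        completed.update(task.output for task in ready)
        history.append(sorted(task.ident for task in ready))
    return history


# Model a is warm cycled (depends on its own previous instance); b depends on
# a in the same cycle and on its own previous instance.
pool = [
    TaskProxy("a", 1),
    TaskProxy("b", 1, ["a.1 succeeded"]),
    TaskProxy("a", 2, ["a.1 succeeded"]),
    TaskProxy("b", 2, ["a.2 succeeded", "b.1 succeeded"]),
]
history = run_pool(pool)
print(history)
```

In the second step a.2 runs alongside b.1: tasks from two different cycles are active at once, with no suite-wide cycle counter involved.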
The following packages are technically optional as you can construct and run cylc suites without dependency graphing, the gcylc GUI, or template processing but this is not recommended, and without Jinja2 you will not be able to run many of the example suites:
If you use a binary package manager to install graphviz you may also need a couple of devel packages for the pygraphviz build:
This user guide can be generated from the LaTeX source by running make in the top level cylc directory after download. The following TeX packages are required (but note that the exact packages required may be somewhat OS or distribution-dependent):
And for HTML versions of the User Guide:
Finally, cylc makes heavy use of Python ordered dictionary data structures. Significant speedup in parsing large suites can be had by installing the fast C-coded ordereddict module by Anthon van der Neut:
This module is currently included with cylc under $CYLC_DIR/ext, and is built by the top level cylc Makefile. If you install the resulting library appropriately cylc will automatically use it in place of a slower Python implementation of the ordered dictionary structure.
Cylc should run “out of the box” on recent Linux distributions.
For distributed suites the Pyro versions installed on all suite or task hosts must be mutually compatible. Using identical Pyro versions guarantees compatibility but may not be strictly necessary because cylc uses Pyro rather minimally.
Beware of Linux distributions that come packaged with old Pyro versions. Pyro 3.9 and earlier are not compatible with the new-style Python classes used in cylc. It has been reported that Ubuntu 10.04 (Lucid Lynx), released in April 2010, suffers from this problem. Surprisingly, so does Ubuntu 11.10 (Oneiric Ocelot), released in October 2011 - and therefore, presumably, all earlier Ubuntu releases. Attempting to run a suite with Pyro 3.9 or earlier installed results in the following Python traceback:
It has been reported that cylc runs fine on OSX 10.6 SnowLeopard, but on OSX 10.7 Lion there is an issue with constructing proper FQDNs (Fully Qualified Domain Names) that requires a change to the DNS service. Here’s how to solve the problem:
Cylc has incorporated a custom-modified version of the xdot graph viewer (http://code.google.com/p/jrfonseca/wiki/XDot, LGPL license).
First install Pyro, graphviz, Pygraphviz, Jinja2, TeX, and ImageMagick using the package manager on your system if possible; otherwise download the packages manually and follow their native installation documentation. On a modern Linux system, this is very easy. For example, to install cylc-5.1.0 on the Fedora 18 Linux distribution:
If you do not have root access on your intended cylc host machine and cannot get a sysadmin to do this at system level, see Section 4.4 for tips on installing everything to a local user account.
Now check that everything other than the LaTeX packages is installed properly:
If this command reports any errors then the packages concerned are not installed, not in the system Python search path, or (for a local install) not present in your $PYTHONPATH variable.
Cylc installs into a normal user account, as an unpacked release tarball or a git repository clone. See the INSTALL file in the source tree for instructions (also listed in Section G).
Site and user config files define some important parameters that affect all suites, some of which may need to be customized for your site. Section 6 describes how to generate an initial site file and where to install it. All legal site and user config items are defined in Appendix B.
Cylc has a battery of self-diagnosing tests, invoked by the command cylc test-battery. These are primarily intended to check that new developments don’t break existing functionality, but you can also run them after installation to check that everything works properly. See cylc test-battery --help before running the tests.
It is possible to install cylc and all of its software prerequisites under your own user account. Cylc itself is already designed to be installed into a normal user account; just follow the instructions above in Section 4.2. For the other packages, depending on the installation method used for each, it is just a matter of learning how to change the default install path prefix from, for example, /usr/local to $HOME/installed/usr/local and then ensuring that the resulting local package paths are set properly in your PYTHONPATH environment variable.
The graphviz build reportedly may fail on systems that do not have QT installed, hence the ./configure --with-qt=no option above. The graphviz lib and include locations are required when installing Pygraphviz.
Finally, check that everything (other than LaTeX for document processing) is installed:
If this command reports any errors then the packages concerned are not installed, not in the system Python search path, or (for a local install) not present in your $PYTHONPATH variable.
Upgrading is just a matter of unpacking the new cylc release. Successive cylc releases can be installed in parallel as suggested in the INSTALL file (Section G).
You may be accustomed to the idea that a forecasting suite has a “current cycle time”, which is typically the analysis time or nominal start time of the main forecast model(s) in the suite, and that the whole suite advances to the next forecast cycle when all tasks in the current cycle have finished (or even when a particular wall clock time is reached, in real time operation). As explained in the Introduction, this is not how cylc works.
Cylc suites advance by means of individual tasks with private cycle times independently spawning successors at the next valid cycle time for the task, not by incrementing a suite-wide forecast cycle. Each task will be submitted when its own prerequisites are satisfied, regardless of other tasks with other cycle times running, or not, at the time. It may still be convenient at times, however, to refer to the “current cycle”, the “previous cycle”, or the “next cycle” and so forth, with reference to a particular task, or in the sense of all tasks that “belong to” a particular forecast cycle. But keep in mind that the members of these groups may not be present simultaneously in the running suite - i.e. different tasks may pass through the “current cycle” (etc.) at different times as the suite evolves, particularly in delayed (catch up) operation.
Cylc site and user configuration files contain settings that affect all suites. Some of these, such as the range of network ports used by cylc, should be set at site level,
Others, such as the preferred text editor for suite definitions, can be overridden by users,
The cylc get-global-config command retrieves current global settings consisting of cylc defaults overridden by site settings, if any, overridden by user settings, if any. To generate an initial site or user config file:
Settings that do not need to be changed should be deleted or commented out of user config files so that they don’t override future changes to the site file.
Legal items, values, and system defaults are documented in the Site And User Config File Reference, Section B.
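The override order described in this section (cylc defaults, then site settings, then user settings) can be illustrated with a small Python sketch. The item names and values below are invented for the example and are not taken from the config file reference; merge_settings is a hypothetical helper, not a cylc function.

```python
def merge_settings(*layers):
    """Merge nested dicts left to right; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict):
                # Merge sections item by item rather than replacing them whole.
                merged[key] = merge_settings(merged.get(key, {}), value)
            else:
                merged[key] = value
    return merged


# Hypothetical settings, for illustration only.
defaults = {"editors": {"terminal": "vim"}, "ports": {"base": 7000, "range": 100}}
site = {"ports": {"base": 8000}}
user = {"editors": {"terminal": "emacs -nw"}}

effective = merge_settings(defaults, site, user)
print(effective)
```

Because the user layer only lists the editor, a later change to the site port settings still takes effect for this user, which is why unchanged items are best left out of user files.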
This section provides a hands-on tutorial introduction to basic cylc suite preparation and control. A number of features are not yet touched on by the tutorial examples, however, so please also read the rest of the User Guide.
Some global parameters affecting cylc’s behaviour are defined in a site config file, and can be customized per user in user config files. For example, to choose the text editor invoked by cylc on suite definitions:
Cylc has command line (CLI) and graphical (GUI) user interfaces. To get access to them you just need the cylc bin directory in your shell search path:
The command line interface is unified under a single top level cylc command that provides access to many sub-commands and their help documentation.
The cylc GUI covers the same functionality as the CLI with the addition of live suite monitoring capability, and it is intended to be easier to use without expert knowledge. It can start and stop suites, or connect to suites that are already running; in either case, shutting down the GUI does not affect the suite itself.
Clicking on a suite in the summary GUI, shown in Figure 15, opens a gcylc instance for it.
Cylc suites are defined by extended-INI format suite.rc files (the main file format extension is section nesting). These reside in suite definition directories that may also contain a bin directory and any other suite-related files.
Suite registration associates a name with a suite definition directory, in a simple database. Cylc commands that parse suite definition files can take the file path or the suite name as input; commands that interact with running suites have to target the suite by name.
At registration time a random string of characters is written to a file called passphrase in the suite definition directory. At run time any contact from cylc client programs (running tasks, user commands, the cylc GUI) must use the same passphrase to authenticate with the running suite. This prevents unauthorized users interfering in your suites (network communication between running processes is not subject to Unix user account permissions). Local tasks and user commands on the suite host automatically use the passphrase in the suite definition directory. For remote tasks and commands, however, the passphrase must be installed appropriately on the remote account - see Section 7.16 below.
Run the following command to import cylc’s example suites to a chosen directory location and register them for use under the examples name group:
(first check that $TMPDIR is defined in your environment, or else use a different location). List the newly registered tutorial suites using the cylc db print command:
See cylc db print --help for other display options. The tree-form display shows how hierarchical suite names can be used to organize related suites nicely (suite names do not have to be related to their source directory paths, although they are in this case):
Rename (re-register) the tutorial suites to make their names a bit shorter:
Suite definitions can be validated against the suite.rc file format specification to detect many types of error without running the suite.
Here’s the traditional Hello World program rendered as a cylc suite:
Cylc suites feature a clean separation of scheduling configuration, which determines when tasks are ready to run; and runtime configuration, which determines what to run (and where and how to run it) when a task is ready. In this example the [scheduling] section defines a single task called hello that triggers immediately when the suite starts up. When the task finishes the suite shuts down. That this is a dependency graph will be more obvious when more tasks are added. Under the [runtime] section the command scripting item defines a simple inlined implementation for hello: it sleeps for ten seconds, then prints Hello World!, and exits. This ends up in a job script generated by cylc to encapsulate the task (below) and, thanks to some defaults designed to allow quick prototyping of new suites, it is submitted to run as a background job on the suite host. In fact cylc even provides a default task implementation that makes the entire [runtime] section technically optional:
(the resulting dummy task just prints out some identifying information and exits).
The text editor invoked by cylc on suite definitions is determined by cylc site and user config files, as shown above in Section 7.2. Check that you have renamed the tutorial examples suites as described just above and open the Hello World suite definition in your text editor:
Alternatively, start gcylc on the suite,
and choose Suite → Edit from the menu.
The editor will be invoked from the suite definition directory for easy access to other suite files (in this case there are none). There are syntax highlighting control files for several text editors under /path/to/cylc/conf/; see in-file comments for installation instructions.
Run the suite at the terminal with the cylc run command:
The --no-detach option tells cylc not to daemonize so that output is printed to the terminal. When the task is ready to run cylc generates a special job script to run it. The command line used to submit the job script, which depends on the task’s job submission method and host machine, is printed to suite stdout. Messages subsequently received from the running task are also printed. More detailed information is written, time-stamped, to a suite log. The suite automatically shuts down when and if all tasks have succeeded.
The cylc GUI can start and stop suites, or (re)connect to suites that are already running: gcylc
Use the tool bar Play button, or the Control → Run menu item, to run the suite again. You may want to alter the suite definition slightly to make the task take longer to run. Try right-clicking on the hello task to view its output logs. The relative merits of the three suite views - dot, tree, and graph - will be more apparent later when we have more tasks. Closing the GUI does not affect the suite itself.
Suites that are currently running can be detected with command line or GUI tools:
At run time, task instances are identified by name, which is determined entirely by the suite definition, and a cycle time or integer tag:
Non-cycling tasks usually just have the tag 1, but this still has to be used to target the task instance with cylc commands.
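As a small illustration, a task instance identifier such as hello.1 can be split back into its name and tag in Python. The helper split_task_id is hypothetical, and the ten-digit cycle time in the second call is an assumed example value.

```python
def split_task_id(task_id):
    """Split 'NAME.TAG' into (name, tag); the tag is a cycle time or an integer tag."""
    name, sep, tag = task_id.rpartition(".")
    if not sep or not name or not tag:
        raise ValueError(f"not a NAME.TAG task identifier: {task_id!r}")
    return name, tag


print(split_task_id("hello.1"))
print(split_task_id("foo.2013080800"))  # assumed cycle time format, for illustration
```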
Task job scripts are generated by cylc to wrap the task implementation specified in the suite definition (environment, command scripting, etc.) in error trapping code and cylc messaging calls to report task progress back to the suite. Job scripts are saved to the suite run directory - the location can be seen in the job submission commands printed to suite stdout. They can be viewed by right-clicking on the task in the cylc GUI, or printed to the terminal:
Or a new job script can be generated on the fly for inspection,
Take a look at the job script generated for hello.1 during the suite run above. The command scripting should be clearly visible toward the bottom of the file.
The hello task in the first tutorial suite defaults to running as a background job on the suite host. To submit it to the Unix at scheduler instead, configure its job submission settings as in tut.oneoff.jobsub:
If you run the suite (first check that the at daemon atd is running on the suite host) a different, at-specific job submission command will be used and printed to stdout:
Cylc supports a number of different job submission methods. Tasks submitted to external batch queuing systems like at, PBS, SLURM, or loadleveler, will be displayed as submitted in cylc until they actually start executing.
If the --no-detach option is not used, suite stdout and stderr will be directed to the suite run directory along with the time-stamped suite log file, and task job scripts and job logs (task stdout and stderr). The default suite run directory location is $HOME/cylc-run:
The suite run database, suite environment file, suite state files, and task status files are used internally by cylc. Tasks execute in sub-directories of work/, which are automatically deleted if empty when the task finishes. The suite share/ directory is made available to all tasks (by $CYLC_SUITE_SHARE_DIR) as a common share space. Job log filenames have the task try number appended (here just 1) - this increments from 1 if a task is configured to retry on failure, to avoid overwriting the logs from previous tries.
The top level run directory location can be changed in site and user config files if necessary, and the suite share and work locations can be configured separately because of the potentially larger disk space requirement.
Task job logs can be viewed by right-clicking on tasks in the gcylc GUI (so long as the task proxy is live in the suite), manually accessed from the log directory (of course), or printed to the terminal with the cylc log command:
For a more sophisticated web-based interface to suite and task logs, see Rose in Section 14.
The hello task in the first two tutorial suites defaults to running on the suite host. To make it run on a remote host instead change its runtime configuration as in tut.oneoff.remote:
For remote task hosting to work several requirements must be satisfied:
If your username is different on the task host the [[[remote]]] section also supports an owner=username item, or your $HOME/.ssh/config file can be configured for username translation.
If you configure a task host according to the requirements above and run the suite again you’ll see that the job submission command printed to suite stdout is now considerably more complicated. That’s because it has to create remote log directories, source login scripts to ensure cylc is visible on the remote host, pipe the task job script over, and submit it to run there by the configured job submission method:
Remote task job logs are saved to the suite run directory on the task host, not on the suite host, although they can be retrieved by right-clicking on the task in the GUI. Rose (Section 14.1) provides a task event handler to pull logs back to the suite host.
To make a second task called goodbye trigger after hello finishes successfully, return to the original example, tut.oneoff.basic, and change the suite graph as in tut.oneoff.goodbye:
or to trigger it at the same time as hello,
and configure the new task’s behaviour under [runtime]:
Run tut.oneoff.goodbye and check the output from the new task:
Task names in the graph string can be qualified with a state indicator to trigger off task states other than success:
A common use of this is to automate recovery from known modes of failure:
i.e. if task goodbye fails, trigger another task that (presumably) really says goodbye.
Failure triggering generally requires use of suicide triggers as well, to remove the recovery task if it isn’t required (otherwise it would hang about indefinitely in the waiting state):
This means if goodbye fails, trigger really_goodbye; and otherwise, if goodbye succeeds, remove really_goodbye from the suite.
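The combined effect of the failure trigger and the suicide trigger can be sketched as a pair of rules applied to a pool of task states. This is a Python illustration of the logic only; the rule tuples, the state names, and apply_rules are assumptions made for the example, not cylc syntax or internals.

```python
# Each rule: (upstream task, upstream state, action, downstream task).
RULES = [
    ("goodbye", "failed", "trigger", "really_goodbye"),    # failure trigger
    ("goodbye", "succeeded", "remove", "really_goodbye"),  # suicide trigger
]


def apply_rules(pool, rules):
    """Apply trigger and removal rules once to a {task: state} pool; return the new pool."""
    result = dict(pool)
    for upstream, state, action, target in rules:
        # Rules only act on a target that is still waiting in the pool.
        if pool.get(upstream) != state or result.get(target) != "waiting":
            continue
        if action == "trigger":
            result[target] = "submitted"
        elif action == "remove":
            del result[target]
    return result


after_failure = apply_rules({"goodbye": "failed", "really_goodbye": "waiting"}, RULES)
after_success = apply_rules({"goodbye": "succeeded", "really_goodbye": "waiting"}, RULES)
print(after_failure)
print(after_success)
```

Without the second rule, the success case would leave really_goodbye waiting in the pool indefinitely.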
Try running tut.oneoff.suicide, which also configures the hello task’s runtime to make it fail, to see how this works.
The [runtime] section is actually a multiple inheritance hierarchy. Each subsection is a namespace that represents a task, or if it inherits from other namespaces, a family. This allows common configuration to be factored out of related tasks very efficiently.
The [root] namespace is at the root of all runtime hierarchies. It provides defaults for all tasks in the suite. Here both tasks inherit command scripting from root, which they customize with different values of the environment variable $GREETING. Note that inheritance from root is implicit; from other parents an explicit inherit = PARENT is required, as shown below.
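The inheritance rule just described can be sketched in Python. This is a simplified illustration that handles a single parent per namespace only; the command scripting value and the hello_again namespace are invented for the example, and resolve is a hypothetical helper rather than cylc's own resolution code.

```python
# Hypothetical runtime namespaces, written as nested dicts.
RUNTIME = {
    "root": {"command scripting": 'echo "$GREETING World!"'},
    "hello": {"environment": {"GREETING": "Hello"}},
    "goodbye": {"environment": {"GREETING": "Goodbye"}},
    "hello_again": {"inherit": "hello"},  # explicit parent other than root
}


def resolve(name, runtime):
    """Return the effective settings for a namespace after inheritance."""
    namespace = runtime[name]
    if name == "root":
        parent = None
    else:
        # Inheritance from root is implicit when no parent is named.
        parent = namespace.get("inherit", "root")
    merged = resolve(parent, runtime) if parent else {}
    for key, value in namespace.items():
        if key == "inherit":
            continue
        if isinstance(value, dict):
            merged[key] = {**merged.get(key, {}), **value}
        else:
            merged[key] = value
    return merged


hello = resolve("hello", RUNTIME)
goodbye = resolve("goodbye", RUNTIME)
hello_again = resolve("hello_again", RUNTIME)
print(hello)
print(goodbye)
```

Both tasks end up with the same command scripting from root and differ only in the value of GREETING.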
Task families defined by runtime inheritance can also be used as shorthand in graph trigger expressions. To see this, consider two “greeter” tasks that trigger off another task foo,
If we put the common greeting functionality of greeter_1 and greeter_2 into a special GREETERS family, the graph can be expressed more efficiently like this:
i.e. if foo succeeds, trigger all members of GREETERS at once. Here’s the full suite with runtime hierarchy shown:
Verbose validation shows the family member substitution done when the suite definition is parsed:
Experiment with the tut.oneoff.ftrigger1 suite to see how this works.
Tasks (or families) can also trigger off other families, but in this case we need to specify what the trigger means in terms of the upstream family members. Here’s how to trigger another task bar if all members of GREETERS succeed:
Verbose validation in this case reports:
Cylc ignores family member qualifiers like succeed-all on the right side of a trigger arrow, where they don’t make sense, to allow the two graph lines above to be combined in simple cases:
Any task triggering status qualified by -all or -any, for the members, can be used with a family trigger. For example, here’s how to trigger bar if all members of GREETERS finish (succeed or fail) and any of them succeed:
(use of GREETERS:succeed-any by itself here would trigger bar as soon as any one member of GREETERS completed successfully). Verbose validation now begins to show how family triggers can simplify complex graphs, even for this tiny two-member family:
Experiment with tut.oneoff.ftrigger2 to see how this works.
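To make the family member substitution concrete, here is a Python sketch that expands a qualified family trigger into an expression over individual members. The helper expand_family_trigger and the exact text it produces are assumptions for illustration; the expressions that cylc reports under verbose validation may be formatted differently.

```python
def expand_family_trigger(members, qualifier):
    """Expand e.g. 'succeed-all' over family members into a member-level expression."""
    state, _, scope = qualifier.rpartition("-")
    if scope not in ("all", "any") or not state:
        raise ValueError(f"expected STATE-all or STATE-any, got {qualifier!r}")
    # '-all' requires every member (AND); '-any' requires at least one (OR).
    joiner = " & " if scope == "all" else " | "
    expression = joiner.join(f"{member}:{state}" for member in members)
    return f"({expression})" if len(members) > 1 else expression


greeters = ["greeter_1", "greeter_2"]
all_finish = expand_family_trigger(greeters, "finish-all")
any_succeed = expand_family_trigger(greeters, "succeed-any")
print(f"{all_finish} & {any_succeed} => bar")
```

Even for two members the expanded condition is noticeably longer than the family form, and it grows with every member added to the family.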
You can style dependency graphs with an optional [visualization] section, as shown in tut.oneoff.ftrigger2:
To display the graph in an interactive viewer,
It should look like Figure 16 (with the GREETERS family node expanded on the right).
Graph styling can be applied to entire families at once, and custom “node groups” can also be defined for non-family groups.
The tasks in our examples so far have all had inlined implementation, in the suite definition, but real tasks often need to call external commands, scripts, or executables. To try this, let’s return to the basic Hello World suite and cut the implementation of the task hello out to a file hello.sh in the suite bin directory:
Make the task script executable, and change the hello task runtime section to invoke it:
If you run the suite now the new greeting from the external task script should appear in the hello task stdout log. This works because cylc automatically adds the suite bin directory to $PATH in the environment passed to tasks via their job scripts. To execute scripts (etc.) located elsewhere you can refer to the file by its full file path, or set $PATH appropriately yourself (this could be done via $HOME/.profile, which is sourced at the top of the task job script, or in the suite definition itself).
Note the use of set -e above to make the script abort on error. This allows the error trapping code in the task job script to automatically detect unforeseen errors.
So far we’ve considered non-cycling tasks, which finish without spawning a successor. Cycling tasks have an associated cycle time, and they spawn a successor at their next cycle time as soon as they are submitted to run (so that successive instances of a task can run in parallel if the opportunity arises and their dependencies allow it).
Open the tut.cycling.one suite:
The difference between cycling and non-cycling suites is all in the [scheduling] section, so we will leave the [runtime] section alone for now (this will result in cycling dummy tasks). Note that the graph is now defined under an Hours Of The Day cycling section - each task in the graph section will have a succession of cycle times ending in 00 or 12 hours, between specified initial and final cycle times (or indefinitely, if no final cycle time is given), as shown in Figure 17.
If you run this suite instances of foo will spawn in parallel out to the suite runahead limit, and each bar will trigger off the corresponding instance of foo at the same cycle time. The runahead limit prevents uncontrolled spawning of cycling tasks in suites that are not constrained by clock triggers in real time operation. The default limit is twice the shortest cycling interval in the suite. Cycling tasks can be declared sequential to prevent successive instances running in parallel, if necessary (Section 9.3.5).
Experiment with tut.cycling.one to see how cycling tasks work.
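To make the hours-of-the-day sequence concrete, here is a minimal Python sketch that lists the cycle times a task under a 0,12 section would take between an initial and a final cycle time. The function name, the example dates, and the YYYYMMDDHH string format are assumptions for illustration; cylc computes this internally.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d%H"  # assumed cycle time format: YYYYMMDDHH


def cycle_times(hours, initial, final):
    """Yield cycle times from initial to final (inclusive) whose hour of
    the day is in the given set of hours."""
    t = datetime.strptime(initial, FMT)
    end = datetime.strptime(final, FMT)
    while t <= end:
        if t.hour in hours:
            yield t.strftime(FMT)
        t += timedelta(hours=1)  # step hourly, keep only matching hours


print(list(cycle_times({0, 12}, "2013080800", "2013080912")))
# ['2013080800', '2013080812', '2013080900', '2013080912']
```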
The tut.cycling.two suite adds inter-cycle dependence to the previous example:
For any given cycle time T in the sequence defined by the cycling graph section heading, bar triggers off foo as before, but now foo triggers off its own previous instance foo[T-12]. Figure 18 shows how this connects the cycling graph sections together.
Experiment with this suite to see how inter-cycle triggers work. Note that the first instance of foo, at suite start-up, will trigger immediately in spite of its inter-cycle trigger, because cylc ignores triggers that reach back beyond the initial cycle time.
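The start-up behaviour can be sketched in a few lines of Python. The sketch assumes the dependencies described above (bar triggers off foo, and foo off foo[T-12]) and a YYYYMMDDHH cycle time format; the function is hypothetical and only mimics how a trigger reaching back beyond the initial cycle time is dropped.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d%H"  # assumed cycle time format: YYYYMMDDHH


def prerequisites(task, cycle, initial):
    """Return the (task, cycle time) pairs that must succeed before
    `task` can run at `cycle`, for a suite started at `initial`."""
    if task == "bar":
        return [("foo", cycle)]  # bar triggers off foo at the same cycle
    if task == "foo":
        prev = datetime.strptime(cycle, FMT) - timedelta(hours=12)
        if prev < datetime.strptime(initial, FMT):
            return []  # foo[T-12] is before the initial cycle: ignored
        return [("foo", prev.strftime(FMT))]
    raise KeyError(task)
```

For a suite started at 2013080800, the first foo has no prerequisites and triggers immediately, while foo at 2013080812 waits on foo at 2013080800.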
The presence of an inter-cycle trigger usually implies something special has to happen at start-up, however. If a model depends on its own previous instance for restart files, for instance, then some special process will typically have to generate the initial set of restart files when there is no previous cycle to do it. The following sections illustrate several ways of handling this in cylc suites.
Asynchronous tasks are non-cycling tasks with no associated cycle time, as in tut.cycling.three:
This is shown on the left of Figure 19.
Initially foo[T-12] will be ignored because its cycle time is earlier than the suite’s initial cycle time. In subsequent cycles dependence on the asynchronous task will be ignored and foo will trigger off its previous instance.
An alternative to an asynchronous task is a start-up task, which is a non-cycling task that nevertheless has an associated cycle time, as in tut.cycling.four:
This is shown on the right of Figure 19. Initially foo[T-12] will be ignored because its cycle time is earlier than the suite’s initial cycle time. In subsequent cycles dependence on the start-up task will be ignored and foo will trigger off its previous instance.
Special one-off cold-start tasks provide another way to handle inter-cycle dependence at start-up, illustrated by tut.cycling.five.
For any given cycle time a warm-cycled model can in principle trigger off a previous instance of itself or off a special cold start process that generates the same result, technically, in terms of restart files for the model. Cold-start tasks in cylc are intended to closely mirror this real process. Cylc somewhat arbitrarily assigns the cold-start task the same cycle time as the associated model, but a cycle time offset can be computed by the task itself if necessary.
The conditional OR trigger means this does not actually rely on cylc ignoring triggers that reach back beyond the initial cycle time. It also means dependence on the cold-start task can be retained in subsequent cycles without stalling the suite, and consequently cold-start tasks can be inserted later (cylc insert --help) to restart a model in-suite after a failure that requires missing one or more cycles. Conversely, because cylc now ignores pre-initial-cycle triggers, the cold-start OR construct is no longer necessary to bootstrap a suite with inter-cycle triggers into action - you can use the arguably simpler start-up tasks as described above.
Real suites may need a number of inter-dependent cold-start, start-up, or asynchronous tasks at start-up.
Cylc has built in support for the Jinja2 template processor, which allows us to embed code in suite definitions to generate the final result seen by cylc.
The tut.oneoff.jinja2 suite illustrates two common uses of Jinja2: changing suite content or structure based on the value of a logical switch; and iteratively generating dependencies and runtime configuration for groups of related tasks:
To view the result of Jinja2 processing with the Jinja2 flag MULTI set to False:
And with MULTI set to True:
Tasks can be configured to retry a number of times if they fail. An environment variable $CYLC_TASK_TRY_NUMBER increments from 1 on each successive try, and is passed to the task to allow different behaviour on the retry:
When a task with configured retries fails, its cylc task proxy goes into the retrying state until the next retry delay is up, then it resubmits. It only enters the failed state on a final definitive failure.
Experiment with tut.oneoff.retry to see how this works.
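The sequence of states a retrying task passes through can be illustrated with the following Python sketch. It is a simplified model, not cylc code: the function name and state strings are assumptions, and the retry delays are only counted, not waited on.

```python
def run_with_retries(job, retry_delays):
    """Record the states of a task that may retry.

    job          -- callable taking the try number (starting at 1) and
                    returning True on success, False on failure
    retry_delays -- one delay per configured retry; a real scheduler
                    would wait this long before resubmitting
    """
    states = []
    max_tries = len(retry_delays) + 1
    for try_number in range(1, max_tries + 1):
        states.append("running")
        if job(try_number):
            states.append("succeeded")
            return states
        if try_number < max_tries:
            states.append("retrying")  # retries remain: not yet failed
        else:
            states.append("failed")    # final, definitive failure
    return states
```

A job that only succeeds on its third try, with three retries configured, passes through running, retrying, running, retrying, running, succeeded; it never enters the failed state.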
If you have read access to another user’s account (even on another host) it is possible to use cylc monitor to look at their suite’s progress without full shell access to their account. To do this, you will need to copy their suite passphrase to
(use of the host and owner names is optional here - Section 12.5.1) and also retrieve the port number of the running suite, which can be found in their account:
Once you have this information, you can run
to view the progress of their suite.
Other suite-connecting commands work in the same way too; see Section 12.9.
The cylc suite search tool reports pattern matches in the suite definition by line number, suite section, and file, even if the suite uses nested include-files, and by file and line number for matches in suite bin scripts:
Almost every feature of cylc can be tested quickly and easily with a simple dummy suite. You can write your own, or start from one of the example suites in /path/to/cylc/examples (see use of cylc import-examples above) - they all run “out of the box” and can be copied and modified at will.
Cylc commands target suites via names registered in a suite name database located at $HOME/.cylc/REGDB/. Suite names are hierarchical like directory paths, allowing nested tree-like grouping, but use the ‘.’ character as a delimiter. This :
Suite titles held in the name database are parsed from the suite definition at the time of initial suite registration. If you change the title later use cylc db refresh to update the database.
Name groups are entirely virtual: they do not need to be explicitly created before use, and they automatically disappear if all suites are removed from them. From the listing above, for example, to move the suite nwp.oper.region2 into the nwp.test group:
And to move nwp.test.region2 into a new group nwp.para:
Currently you cannot explicitly indicate a group name on the command line by appending a dot character. Rather, in database operations such as copy, reregister, or unregister, the identity of the source item (group or suite) is inferred from the content of the database; and if the source item is a group, so must the target be a group (or it will be, in the case of an item that will be created by the operation). This means that you cannot copy a single suite into a group that does not exist yet unless you specify the entire target suite name.
cylc db register --help shows a number of other examples.
On the command line, the ‘database’ (or ‘db’) command category contains commands to implement the aforementioned operations.
Groups of suites (at any level in the name hierarchy) can be deleted, copied, imported, and exported; as well as individual suites. To do this, just use suite group names as source and/or target for operations, as appropriate. For instance, if a group foo.bar contains the suites foo.bar.baz and foo.bar.qux, you can copy a single suite like this:
(resulting in a new suite boo); or the group like this:
(resulting in new suites boo.baz and boo.qux); or the group like this:
(resulting in new suites boo.bar.baz and boo.bar.qux). When suites are copied, the suite definition directories are copied into a directory tree, under the target directory, that reflects the suite name hierarchy. cylc copy --help has some explicit examples.
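The naming rule behind these three cases can be sketched in Python. The helper below is hypothetical and models only how target names are derived from the source item, not the copying of suite definition directories.

```python
def copy_names(registered, source, target):
    """Map each registered suite name affected by a copy to its new name.

    If `source` is a registered suite, it is copied to `target` exactly.
    Otherwise `source` is treated as a group, and every suite under it
    keeps its name relative to the group, re-rooted under `target`.
    """
    if source in registered:
        return {source: target}
    prefix = source + "."
    members = [name for name in registered if name.startswith(prefix)]
    if not members:
        raise KeyError("no such suite or group: %s" % source)
    return {name: target + "." + name[len(prefix):] for name in members}


db = ["foo.bar.baz", "foo.bar.qux"]
print(copy_names(db, "foo.bar.baz", "boo"))  # single suite -> boo
print(copy_names(db, "foo.bar", "boo"))      # group -> boo.baz, boo.qux
print(copy_names(db, "foo", "boo"))          # group -> boo.bar.baz, boo.bar.qux
```

Note how the source item's identity (suite or group) is inferred from the content of the name list rather than from the command arguments.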
The same functionality is also available by right-clicking on suites or groups in the gcylc “Open Registered Suite” dialog.
Any client process that connects to a running suite (this includes task messaging and user-invoked interrogation and control commands) must authenticate with a secure passphrase that has been loaded by the suite. A random passphrase is generated automatically in the suite definition directory at registration time if one does not already exist there. For the default Pyro-based connection method the passphrase file must be distributed to other accounts that host running tasks or from which you need monitoring or control access to the running suite.
Alternatively, cylc can be configured to,
Neither of these methods requires the suite passphrase to be installed on the task host. For ssh re-invocation, ssh keys must be installed for the task-to-suite direction in addition to the suite-to-task setup already required for job submission. The automatic polling mechanism can be used as a last resort for hosts that do not allow routing back to the suite host for Pyro or ssh. It can also be used as a regular health check on submitted tasks under the other communications methods.
See Section 12 for more detail on cylc client/server communications, and how to use it.
Cylc suites are defined in structured, validated, suite.rc files that concisely specify the properties of, and the relationships between, the various tasks managed by the suite. This section of the User Guide deals with the format and content of the suite.rc file, including task definition. Task implementation - what’s required of the real commands, scripts, or programs that do the processing that the tasks represent - is covered in Section 10; and task job submission - how tasks are submitted to run - is in Section 11.
A cylc suite definition directory contains:
A typical example:
Suite.rc files are an extended-INI format with section nesting.
Embedded template processor expressions may also be used in the file, to programmatically generate the final suite definition seen by cylc. Currently the Jinja2 template processor is supported (http://jinja.pocoo.org/docs); see Jinja2 (Section 9.6) for examples. In the future cylc may provide a plug-in interface to allow use of other template engines too.
The following defines legal suite.rc syntax:
Suites that embed Jinja2 code (Section 9.6) must process to raw suite.rc syntax.
Cylc has native support for suite.rc include-files, which may help to organize large suites. Inclusion boundaries are completely arbitrary - you can think of include-files as chunks of the suite.rc file simply cut-and-pasted into another file. Include-files may be included multiple times in the same file, and even nested. Include-file paths can be specified portably relative to the suite definition directory, e.g.:
Editing Temporarily Inlined Suites Cylc’s native file inclusion mechanism supports optional inlined editing:
The suite will be split back into its constituent include-files when you exit the edit session. While editing, the inlined file becomes the official suite definition so that changes take effect whenever you save the file. See cylc prep edit --help for more information.
Include-Files via Jinja2 Jinja2 (Section 9.6) also has template inclusion functionality.
Cylc comes with a syntax file to configure suite.rc syntax highlighting and section folding in the vim editor, as shown in Figure 11. We also have an emacs font-lock mode, and syntax files for the gedit and kate editors:
Refer to comments at the top of each file to see how to use them.
Cylc suite.rc files consist of a suite title and description followed by configuration items grouped under several top level section headings:
Cylc suite.rc files are automatically validated against a specification that defines all legal entries, values, options, and defaults. This detects formatting errors, typographic errors, illegal items and illegal values prior to run time. Some values are complex strings that require further parsing by cylc to determine their correctness (this is also done during validation). All legal entries are documented in the Suite.rc Reference (Appendix A).
The validator reports the line numbers of detected errors. Here’s an example showing a section heading with a missing right bracket:
If the suite.rc file uses include-files cylc view will show an inlined copy of the suite with correct line numbers (you can also edit suites in a temporarily inlined state with cylc edit --inline).
Validation does not check the validity of chosen job submission methods.
The [scheduling] section of a suite.rc file defines the relationships between tasks in a suite - the information that allows cylc to determine when tasks are ready to run. The most important component of this is the suite dependency graph. Cylc graph notation makes clear textual graph representations that are very concise because sections of the graph that repeat at different hours of the day, say, only have to be defined once. Here’s an example with dependencies that vary depending on cycle time:
Figure 20 shows the complete suite.rc listing alongside the suite graph. This is a complete, valid, runnable suite (it will use default task runtime properties such as command scripting).
Multiline graph strings may contain:
Suite dependency graphs can be broken down into pairs in which the left side (which may be a single task or family, or several that are conditionally related) defines a trigger for the task or family on the right. For instance the “word graph” C triggers off B which triggers off A can be deconstructed into pairs C triggers off B and B triggers off A. In this section we use only the default trigger type, which is to trigger off the upstream task succeeding; see Section 9.3.4 for other available triggers.
In the case of cycling tasks, the triggers defined by a graph string are valid for cycle times matching the list of hours specified for the graph section. For example this graph,
implies that B triggers off A for cycle times in which the hour matches 0 or 12.
To define intercycle dependencies, attach an offset indicator to the left side of a pair:
This means B[T] triggers off A[T-12] for cycle times T with hours matching 0 or 12. T must be implicit unless there is a cycle time offset - this keeps graphs clean and concise because the majority of tasks will typically depend only on others with the same cycle time. Cycle time offsets can only appear on the left of a pair, because pairs define triggers for the right task at cycle time T. However, A => B[T-6], which is illegal, can be reformulated as a future trigger A[T+6] => B (see Section 9.3.4.10).
Triggers can be chained together. This graph:
is equivalent to this:
Each trigger in the graph must be unique but the same task can appear in multiple pairs or chains. Separately defined triggers for the same task have an AND relationship. So this:
is equivalent to this:
In summary, the branching tree structure of a dependency graph can be partitioned into lines (in the suite.rc graph string) of pairs or chains, in any way you like, with liberal use of internal white space and comments to make the graph structure as clear as possible.
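The equivalence between chains and pairs, and the AND relationship between separately defined triggers, can be demonstrated with a small Python sketch. It handles only plain task names joined by =>, with # comments; conditional operators, trigger types, and cycle time offsets are deliberately left out, and the function names are made up for the example.

```python
def graph_pairs(graph):
    """Split a multiline graph string into a set of
    (upstream, downstream) trigger pairs."""
    pairs = set()
    for line in graph.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        nodes = [node.strip() for node in line.split("=>")]
        pairs.update(zip(nodes, nodes[1:]))   # A => B => C gives (A,B), (B,C)
    return pairs


def upstream_of(pairs, task):
    """All tasks that `task` triggers off; all of them must succeed (AND)."""
    return {up for up, down in pairs if down == task}


chained = graph_pairs("A => B => C")
split = graph_pairs("""
    A => B   # first pair
    B => C   # second pair
""")
print(chained == split)                               # True
print(sorted(upstream_of(graph_pairs("A => C\nB => C"), "C")))  # ['A', 'B']
```

Because the pairs are held in a set, a trigger written twice collapses to a single entry, so the partitioning of the graph into lines has no effect on the result.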
Handling Long Graph Lines Long chains of dependencies can be split into pairs:
If you have very long task names, or long conditional trigger expressions (below) then you can use the suite.rc line continuation marker:
Note that a line continuation marker must be the final character on the line; it cannot be followed by trailing spaces or a comment.
A suite definition can contain multiple graph strings that are combined to generate the final graph. There are different graph VALIDITY section headings for cycling, one-off asynchronous, and repeating asynchronous tasks. Additionally, there may be multiple graph strings under different VALIDITY sections for cycling tasks with different dependencies at different cycle times.
One-off Asynchronous Tasks Figure 21 shows a small suite of one-off asynchronous tasks; these have no associated cycle time and don’t spawn successors (once they’re all finished the suite just exits). The integer 1 attached to each graph node is just an arbitrary label, akin to the task cycle time in cycling tasks; it increments when a repeating asynchronous task (below) spawns.
Cycling Tasks For cycling tasks the graph VALIDITY section heading defines a sequence of cycle times for which the subsequent graph section is valid. Figure 22 shows a small suite of cycling tasks.
Stepped Daily, Monthly, And Yearly Cycling In addition to the original hours-of-the-day section headings, cylc now has an extensible cycling mechanism and (so far) stepped daily, monthly, and yearly cycling modules:
The section heading arguments here are an anchor datetime and an integer step. The cycle sequence always passes through the anchor regardless of the suite’s initial cycle time. So, for example, Yearly(2010,3) defines a 3-yearly sequence that always lands on the year 2010, not 2011 or 2012, regardless of the initial cycle time - which can be before or after 2010.
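The anchoring rule can be checked with a few lines of Python. This sketch reduces the anchor datetime to a bare year for simplicity, and the function name and arguments are assumptions for illustration rather than part of cylc.

```python
def yearly_sequence(anchor, step, initial_year, count):
    """Return the first `count` years, at or after initial_year, on the
    sequence anchor + k*step (k any integer, positive or negative)."""
    offset = (initial_year - anchor) % step  # non-negative in Python
    first = initial_year if offset == 0 else initial_year + (step - offset)
    return [first + i * step for i in range(count)]


print(yearly_sequence(2010, 3, 2011, 3))  # [2013, 2016, 2019]
print(yearly_sequence(2010, 3, 2005, 3))  # [2007, 2010, 2013]
```

Starting in 2011 the sequence skips to 2013, and starting in 2005 it begins at 2007; in both cases it stays aligned with the anchor year 2010.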
Note that hours-of-the-day graph section headings can also be written to explicitly reference the associated cycling module:
How Multiple Graph Strings Combine For a cycling graph with multiple validity sections for different hours of the day, the different sections add to generate the complete graph. Different graph sections can overlap (i.e. the same hours may appear in multiple section headings) and the same tasks may appear in multiple sections, but individual dependencies should be unique across the entire graph. For example, the following graph defines a duplicate prerequisite for task C:
This does not affect scheduling, but for the sake of clarity and brevity the graph should be written like this:
Combined Asynchronous And Synchronous Graphs Cycling tasks can be made to wait on one-off asynchronous tasks, as shown in Figure 23. Alternatively, they can be made to wait on one-off synchronous start-up tasks, which have an associated cycle time even though they are non-cycling - see Figure 24.
Synchronous Start-up vs One-off Asynchronous Tasks One-off synchronous start-up tasks run only when a cycling suite is cold-started and they are often associated with subsequent one-off cold-start tasks used to bootstrap a cycling suite into existence.
The distinction between cold- and warm-start is only meaningful for cycling tasks, and one-off asynchronous tasks may be best used in constructing entirely non-cycling suites.
However, one-off asynchronous tasks can precede cycling tasks in the same suite, as shown above. It seems likely that, if used in this way, they will be intended as start-up tasks - so currently one-off asynchronous tasks only run in a cold-start.
Repeating Asynchronous Tasks Repeating asynchronous tasks can be used, for example, to process satellite data that arrives at irregular time intervals. Each new dataset must have a unique “asynchronous ID”. If it doesn’t naturally have such an ID a string representation of the data arrival time could be used. The graph VALIDITY section heading must contain “ASYNCID:” followed by a regular expression that matches the actual IDs. Additionally, one task in the suite must be a designated “daemon” that waits indefinitely on incoming data and reports each new dataset (and its ID) back to the suite by means of a special output message. When the daemon task proxy receives a matching message it dynamically registers a new output (containing the ID) that downstream tasks can then trigger off. The downstream tasks likewise have prerequisites containing the ID pattern (because they trigger off the aforementioned outputs) and when these get satisfied during dependency negotiation the actual ID is substituted into their own registered outputs. Finally, each asynchronous repeating task proxy passes the ID to its task execution environment as $ASYNCID to allow identification of the correct dataset by task scripts. In this way a tree of tasks becomes dedicated to processing each new dataset, and multiple datasets can be processed in parallel if they become available in quick succession. As Figure 25 shows, a repeating asynchronous suite currently plots just like a one-off asynchronous suite. But at run time the daemon task stays put, while the others continually spawn successors to wait for new datasets to come in. The asynchronous.repeating example suite demonstrates how to do this in a real suite. Note that other trigger types (success, failure, start, suicide, and conditional) cannot currently be used in a repeating asynchronous graph section.

Trigger type, indicated by :type after the upstream task (or family) name, determines what kind of event results in the downstream task (or family) triggering.
Success Triggers The default, with no trigger type specified, is to trigger off the upstream task succeeding:
For consistency and completeness, however, the success trigger can be explicit:
Failure Triggers To trigger off the upstream task reporting failure:
Section 9.3.4.8 (Suicide Triggers) shows one way of handling task B here if A does not fail.
Start Triggers To trigger off the upstream task starting to execute:
This can be used to trigger tasks that monitor other tasks once they (the target tasks) start executing. Consider a long-running forecast model, for instance, that generates a sequence of output files as it runs. A postprocessing task could be launched with a start trigger on the model (model:start => post) to process the model output as it becomes available. Note, however, that there are several alternative ways of handling this scenario: both tasks could be triggered at the same time (foo => model & post), but depending on external queue delays this could result in the monitoring task starting to execute first; or a different postprocessing task could be triggered off an internal output for each data file (model:out1 => post1 etc.; see Section 9.3.4.5), but this may not be practical if the number of output files is large or if it is difficult to add cylc messaging calls to the model.
Finish Triggers To trigger off the upstream task succeeding or failing, i.e. finishing one way or the other:
Internal (Message) Triggers These allow triggering off events that occur while a task runs. A special event message must be registered in the suite definition, and deliberately sent by the task at the appropriate time.
Task A must emit this message when the actual output has been completed - see Reporting Internal Outputs Completed (Section 10.3).
Job Submission Triggers It is also possible to trigger off a task submitting, or failing to submit:
A possible use case for submit-fail triggers: if a task goes into the submit-failed state, possibly after several job submission retries, another task that inherits the same runtime but sets a different job submission method and/or host could be triggered to, in effect, run the same job on a different platform.
Conditional Triggers AND operators (&) can appear on both sides of an arrow. They provide a concise alternative to defining multiple triggers separately:
OR operators (|), which result in true conditional triggers, can only appear on the left,2
Forecasting suites typically have simple conditional triggering requirements, but any valid conditional expression can be used, as shown in Figure 26 (conditional triggers are plotted with open arrow heads).
Suicide Triggers Suicide triggers take tasks out of the suite. This can be used for automated failure recovery. The suite.rc listing and accompanying graph in Figure 27 show how to define a chain of failure recovery tasks that trigger if they’re needed but otherwise remove themselves from the suite (you can run the AutoRecover.async example suite to see how this works). The dashed graph edges ending in solid dots indicate suicide triggers, and the open arrowheads indicate conditional triggers as usual.

Note that multiple suicide triggers combine in the same way as other triggers, so this:
is equivalent to this:
i.e. both foo and bar must succeed for baz to be taken out of the suite. If you really want a task to be taken out if any one of several events occurs then be careful to write it that way:
Family Triggers Families defined by the namespace inheritance hierarchy (Section 9.4) can be used in the graph to trigger whole groups of tasks at the same time (e.g. forecast model ensembles and groups of tasks for processing different observation types at the same time) and for triggering downstream tasks off families as a whole. Higher level families, i.e. families of families, can also be used, and are reduced to the lowest level member tasks. Note that tasks can also trigger off individual family members if necessary.
To trigger an entire task family at once:
This is equivalent to:
To trigger other tasks off families we have to specify whether to trigger off all members starting, succeeding, failing, or finishing, or off any members (doing the same). Legal family triggers are thus:
Here’s how to trigger downstream processing if one or more family members succeed, but only after all members have finished (succeeded or failed):
Intercycle Triggers Typically most tasks in a suite will trigger off others in the same cycle time, but some may depend on others with other cycle times. This notably applies to warm-cycled forecast models, which depend on their own previous instances (see below); but other kinds of intercycle dependence are possible too.3 Here’s how to express this kind of relationship in cylc:
Intercycle and trigger type (and internal output) notation can be combined:
At suite start-up inter-cycle triggers refer to a previous cycle that does not exist. This does not cause the dependent task to wait indefinitely, however, because cylc ignores triggers that reach back beyond the initial cycle time. That said, the presence of an inter-cycle trigger does normally imply that something special has to happen at start-up. If a model depends on its own previous instance for restart files, for instance, then an initial set of restart files has to be generated somehow or the first model task will presumably fail with missing input files. There are several ways to handle this in cylc using different kinds of one-off (non-cycling) tasks that run at suite start-up. They are illustrated in Tutorial Section 7.23.1; to summarize here briefly:
The first two cases are the same, except that start-up tasks are assigned a cycle time (even though they don’t cycle) whereas asynchronous tasks are not. In the first cycle the previous-cycle trigger is ignored and the first cycling tasks trigger off the initial tasks; subsequently dependence on the initial tasks is ignored and the inter-cycle trigger takes effect. Cold-start tasks, on the other hand, can be used for real model cold-start processes, whereby a warm-cycled model at any given cycle time can in principle have its inputs satisfied by a previous instance of itself, or by a cold-start task with (nominally) the same cycle time. In effect, the cold-start task masquerades as the previous-cycle trigger of its associated cycling task. At suite start-up cold-start tasks will trigger the first cycling tasks, and thereafter the inter-cycle trigger will take effect. Unlike for asynchronous and start-up initial tasks, however, the cold-start “OR” construct means that cold-start triggers don’t have to be ignored by cylc after the first cycle, so it is possible to insert cold-start tasks into a suite mid-run to do mid-stream cold-starts after problems that preclude continued normal warm cycling.
One-off initial tasks can invoke real processing to generate the files that are subsequently produced by tasks in the previous cycle; or they could be dummy tasks that represent some external process that does the same before the suite is started - in which case the initial task can just report itself successfully completed after checking that the required files are present.
Warm-Starting Suites For suites with inter-cycle dependence a warm-start is essentially an implicit restart. Rather than loading tasks from a previous recorded suite state, it loads all cycling tasks at a given cycle time assuming that the previous cycle was completed in an earlier suite run. Any initial tasks - asynchronous, start-up, or cold-start - therefore do not need to run again. Dependence on tasks from before the start cycle is still ignored, but cold-start tasks have to be loaded in the succeeded state because dependence on them (in the cold-start OR construct) is retained throughout the suite run as is explained above in Section 9.3.4.10.
Future Triggers Cylc also supports inter-cycle triggering off tasks in the future (with respect to cycle time!):
In contrast to normal inter-cycle triggers, future triggers present a problem at the suite stop time rather than at start-up - in the final cycle B wants to trigger off A at a future cycle time that does not exist. To avoid this problem cylc prevents tasks from spawning successors that depend on tasks in a non-existent future cycle.
Warm cycled forecast models generate restart files, e.g. model background fields, that are required to initialize the next forecast (this is essentially the definition of “warm cycling”). In fact restart files will often be written for a whole series of subsequent cycles in case the next cycle (or the next and the next-next, and so on) has to be omitted:
In other words, task A can trigger off a cotemporal cold-start task, or off its own previous instance, or off the instance before that, and so on. Restart dependencies are unusual because although A could trigger off A[T-12] we don’t actually want it to do so unless A[T-6] fails and can’t be fixed. This is why Task A, above, is declared to be ‘sequential’.4 Sequential tasks do not spawn a successor until they have succeeded (by default, tasks spawn as soon as they start running in order to get maximum functional parallelism in a suite) which means that A[T+6] will not be waiting around to trigger off an older predecessor while A[T] is still running. If A[T] fails though, the operator can force it, on removal, to spawn A[T+6], whose restart dependencies will then automatically be satisfied by the older instance, A[T-6].
Forcing a model to run sequentially means, of course, that its restart dependencies cannot be violated anyway, so we might just ignore them. This is certainly an option, but it should be noted that there are some benefits to having your suite reflect all of the real dependencies between the tasks that it is managing, particularly for complex multi-model operational suites in which the suite operator might not be an expert on the models. Consider such a suite in which a failure in a driving model (e.g. weather) precludes running one or more cycles of the downstream models (sea state, storm surge, river flow, …). If the real restart dependencies of each model are known to the suite, the operator can just do a recursive purge to remove the subtree of all tasks that can never run due to the failure, and then cold-start the failed driving model after a gap (skipping as few cycles as possible until the new cold-start input data are available). After that the downstream models will kick off automatically so long as the gap is spanned by their respective restart files, because their restart dependencies will automatically be satisfied by the older pre-gap instances in the suite. Managing this kind of scenario manually in a complex suite can be quite difficult.
Finally, if a warm cycled model is not declared to be sequential, and you define appropriately labeled explicit restart outputs for it (the labels must contain the word ‘restart’), then the task will spawn as soon as its last restart output is completed, so that successive instances of the task can overlap (i.e. run in parallel) if the opportunity arises. Whether or not this is worth the effort depends on your needs.
The [runtime] section of a suite definition configures what to execute (and where and how to execute it) when each task is ready to run, in a multiple inheritance hierarchy of namespaces culminating in individual tasks. This allows all common configuration detail to be factored out and defined in one place.
Any namespace can configure any or all of the items defined in the Suite.rc Reference, Appendix A.
Namespaces that do not explicitly inherit from others automatically inherit from the root namespace (below).
Nested namespaces define task families that can be used in the graph as convenient shorthand for triggering all member tasks at once, or for triggering other tasks off all members at once - see Family Triggers, Section 9.3.4.9. Nested namespaces can be progressively expanded and collapsed in the dependency graph viewer, and in the gcylc graph and tree views. Only the first parent of each namespace (as for single-inheritance) is used for suite visualization purposes.
Namespace names may contain letters, digits, underscores, and hyphens.
Note that task names need not be hardwired into task implementations because task and suite identity can be extracted portably from the task execution environment supplied by cylc (Section 9.4.7) - then to rename a task you can just change its name in the suite definition.
The root namespace, at the base of the inheritance hierarchy, provides default configuration for all tasks in the suite. Most root items are unset by default, but some have default values sufficient to allow test suites to be defined by dependency graph alone. The command scripting item, for example, defaults to code that prints a message then sleeps for between 1 and 15 seconds and exits. Default values are documented with each item in Appendix A. You can override the defaults or provide your own defaults by explicitly configuring the root namespace.
If a namespace section heading is a comma-separated list of names then the subsequent configuration applies to each list member. Particular tasks can be singled out at run time using the $CYLC_TASK_NAME variable.
As an example, consider a suite containing an ensemble of closely related tasks that each invokes the same script but with a unique argument that identifies the calling task name:
For large ensembles Jinja2 template processing can be used to automatically generate the member names and associated dependencies (see Section 9.6).
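The comma-separated heading rule described above can be sketched in a few lines of Python. This is only an illustration of the behaviour, not cylc's parser: the function name expand_headings, the member names, and the script run-member.sh are all hypothetical.

```python
def expand_headings(runtime):
    """Apply each comma-separated namespace heading to every listed name."""
    expanded = {}
    for heading, items in runtime.items():
        for name in heading.split(","):
            # Each listed namespace receives its own copy of the items.
            expanded.setdefault(name.strip(), {}).update(items)
    return expanded


# Hypothetical ensemble: three members share one runtime section.
runtime = {
    "ens_1, ens_2, ens_3": {
        # The script can tell the members apart via $CYLC_TASK_NAME.
        "command scripting": "run-member.sh $CYLC_TASK_NAME",
    },
}

expanded = expand_headings(runtime)
for name in sorted(expanded):
    print(name, "->", expanded[name]["command scripting"])
```

Each member ends up with identical configuration, so any per-member behaviour has to come from the task name at run time.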
The following listing of the inherit.single.one example suite illustrates basic runtime inheritance with single parents.
If a namespace inherits from multiple parents the linear order of precedence (which namespace overrides which) is determined by the so-called C3 algorithm used to find the linear method resolution order for class hierarchies in Python and several other object oriented programming languages. The result of this should be fairly obvious for typical use of multiple inheritance in cylc suites, but for detailed documentation of how the algorithm works refer to the official Python documentation here: http://www.python.org/download/releases/2.3/mro/.
The inherit.multi.one example suite, listed here, makes use of multiple inheritance:
cylc get-config provides an easy way to check the result of inheritance in a suite. You can extract specific items, e.g.:
or use the --sparse option to print entire namespaces without obscuring the result with the dense runtime structure obtained from the root namespace:
Suite Visualization And Multiple Inheritance
The first parent inherited by a namespace is also used as the collapsible family group when visualizing the suite. If this is not what you want, you can demote the first parent for visualization purposes, without affecting the order of inheritance of runtime properties:
The linear precedence order of ancestors is computed for each namespace using the C3 algorithm. Then any runtime items that are explicitly configured in the suite definition are “inherited” up the linearized hierarchy for each task, starting at the root namespace: if a particular item is defined at multiple levels in the hierarchy, the level nearest the final task namespace takes precedence. Finally, root namespace defaults are applied for every item that has not been configured in the inheritance process (this is more efficient than carrying the full dense namespace structure through from root from the beginning).
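The three steps just described (linearize, inherit explicitly configured items, then apply root defaults) can be sketched as follows. This is a minimal illustration under stated assumptions, not cylc's implementation: the namespace names, the item names, and the flat (non-nested) item dictionaries are all hypothetical simplifications.

```python
def c3_linearize(name, parents):
    """Return the C3 linearization of a namespace from a {name: [parents]} map."""
    bases = parents.get(name, [])
    sequences = [c3_linearize(base, parents) for base in bases] + [list(bases)]
    sequences = [seq for seq in sequences if seq]
    order = [name]
    while sequences:
        for seq in sequences:
            head = seq[0]
            # A valid head must not appear in the tail of any sequence.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise ValueError("inconsistent inheritance hierarchy for " + name)
        order.append(head)
        sequences = [[n for n in seq if n != head] for seq in sequences]
        sequences = [seq for seq in sequences if seq]
    return order


def resolve(name, parents, configured, root_defaults):
    """Merge explicitly configured items, then fill in root defaults."""
    merged = {}
    # Walk from root towards the task so nearer namespaces override.
    for namespace in reversed(c3_linearize(name, parents)):
        merged.update(configured.get(namespace, {}))
    for item, value in root_defaults.items():
        merged.setdefault(item, value)
    return merged


# Hypothetical hierarchy: foo inherits from FAM_A then FAM_B.
PARENTS = {"root": [], "FAM_A": ["root"], "FAM_B": ["root"],
           "foo": ["FAM_A", "FAM_B"]}
CONFIGURED = {"FAM_A": {"host": "hostA"},
              "FAM_B": {"host": "hostB", "queue": "long"}}
ROOT_DEFAULTS = {"command scripting": "echo dummy task; sleep 10"}

ORDER = c3_linearize("foo", PARENTS)
RESULT = resolve("foo", PARENTS, CONFIGURED, ROOT_DEFAULTS)
print(ORDER)
print(RESULT)
```

In this sketch foo takes host from FAM_A (the first parent wins), queue from FAM_B, and the command scripting from the root defaults because nothing in the hierarchy configured it.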
The task execution environment contains suite and task identity variables provided by cylc, and user-defined environment variables. The environment is explicitly exported (by the task job script) prior to executing task command scripting (see Task Job Submission, Section 11).
Suite and task identity are exported first, so that user-defined variables can refer to them. Order of definition is preserved throughout so that variable assignment expressions can safely refer to previously defined variables.
Additionally, access to cylc itself is configured prior to the user-defined environment, so that variable assignment expressions can make use of cylc utility commands:
User Environment Variables
A task’s user-defined environment results from its inherited [[[environment]]] sections:
This results in a task foo with SHAPE=circle, COLOR=blue, and TEXTURE=rough in its environment.
Overriding Environment Variables
When you override inherited namespace items the original parent item definition is replaced by the new definition. This applies to all items including those in the environment sub-sections which, strictly speaking, are not “environment variables” until they are written, post inheritance processing, to the task job script that executes the associated task. Consequently, if you override an environment variable you cannot also access the original parent value:
The compressed variant of this, COLOR = dark-$COLOR, is also in error for the same reason. To achieve the desired result you must use a different name for the parent variable:
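The reason the override fails can be sketched in Python. This is an illustration only: the parent value red, the name BASE_COLOR, and the helper functions are assumptions, and string.Template stands in for shell variable expansion in the job script (undefined variables expand to an empty string).

```python
from collections import defaultdict
from string import Template


def inherit(parent_env, child_env):
    """Child definitions replace parent definitions of the same name."""
    merged = dict(parent_env)
    merged.update(child_env)
    return merged


def evaluate(env):
    """Mimic a job script exporting each variable in definition order."""
    values = defaultdict(str)  # undefined variables expand to ""
    for name, expression in env.items():
        values[name] = Template(expression).substitute(values)
    return dict(values)


# Overriding COLOR discards the parent's value before evaluation.
broken = evaluate(inherit({"COLOR": "red"}, {"COLOR": "dark-$COLOR"}))

# Using a different name in the parent keeps its value available.
fixed = evaluate(inherit({"BASE_COLOR": "red"},
                         {"COLOR": "dark-$BASE_COLOR"}))

print(broken["COLOR"])
print(fixed["COLOR"])
```

By the time anything is evaluated, only one definition of COLOR remains, so the reference to the parent value has nothing to resolve to.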
Suite And Task Identity Variables
The task identity variables provided to tasks by cylc are:
And the suite identity variables are:
Some of these variables are also used by cylc task messaging commands in order to target the right task proxy object in the right suite.
Suite Share And Task Work Directories
A suite share directory is created automatically for use as a file exchange area for tasks on the same task host. It can be accessed via $CYLC_SUITE_SHARE_DIR and its location can be set in the cylc site and user config files.
A task work directory is also created automatically for each task, and can be accessed via the $CYLC_TASK_WORK_DIR variable. Task command scripting is executed from within the work directory (i.e. it is the task’s current working directory). For non-detaching tasks the work directory is automatically removed again if it is empty when the task finishes. The main work directory location is set in the cylc site and user config files, but the lowest-level sub-directory, whose name defaults to the task ID to give each task a unique workspace, can be overridden under [runtime] in suite definitions. This enables groups of tasks that read and write files from their current working directories to be given common work directories as file share spaces.
Other Cylc-Defined Environment Variables
Initial and final cycle times, if supplied via the suite.rc file or the command line, are passed to task execution environments as:
Tasks can use these to determine whether or not they are running in the first or final cycles.
Environment Variable Evaluation
Variables in the task execution environment are not evaluated in the shell in which the suite is running prior to submitting the task. They are written in unevaluated form to the job script that is submitted by cylc to run the task (Section 11.2) and are therefore evaluated when the task begins executing under the task owner account on the task host. Thus $HOME, for instance, evaluates at run time to the home directory of the task owner on the task host.
Tasks can use $CYLC_SUITE_DEF_PATH to access suite files on the task host, and the suite bin directory is automatically added to $PATH. If a remote suite definition directory is not specified, the local (suite host) path will be assumed, with the local home directory, if present, swapped for literal $HOME for evaluation on the task host.
If a task declares an owner other than the suite owner and/or a host other than the suite host, cylc will use passwordless ssh to execute the task on the owner@host account by the configured job submission method,
For this to work,
To learn how to give remote tasks access to cylc, see Section 12.6.
Tasks running on the suite host under another user account are treated as remote tasks.
Remote hosting, like all namespace settings, can be declared globally in the root namespace, or per family, or for individual tasks.
Dynamic Host Selection
Instead of hardwiring host names into the suite definition you can specify a shell command that prints a hostname, or an environment variable that holds a hostname, as the value of the host config item. See Section A.4.1.19.1.
Remote Task Log Directories
Task stdout and stderr streams are written to log files in a suite-specific sub-directory of the suite run directory, as explained in Section 11.4. For remote tasks the same directory is used, but on the task host. Remote task log directories, like local ones, are created on the fly, if necessary, during job submission.
The visualization section of a suite definition is used to configure suite graphing, principally graph node (task) and edge (dependency arrow) style attributes. Tasks can be grouped for the purpose of applying common style attributes. See the suite.rc reference (Appendix A) for details.
Nested families from the runtime inheritance hierarchy can be expanded and collapsed in suite graphs and the gcylc graph view. All families are displayed in the collapsed state at first, unless [visualization]collapsed families is used to single out specific families for initial collapsing.
In the gcylc graph view, nodes outside of the main graph (such as the members of collapsed families) are plotted as rectangular nodes to the right if they are doing anything interesting (submitted, running, failed).
Figure 28 illustrates successive expansion of nested task families in the namespaces example suite.
Cylc has built in support for the Jinja2 template processor in suite definitions. Jinja2 variables, mathematical expressions, loop control structures, conditional logic, etc., are automatically processed to generate the final suite definition seen by cylc.
The need for Jinja2 processing must be declared with a hash-bang comment as the first line of the suite.rc file:
Potential uses for this include automatic generation of repeated groups of similar tasks and dependencies, and inclusion or exclusion of entire suite sections according to the value of a single flag. Consider a large complicated operational suite and several related parallel test suites with slightly different task content and structure (the parallel suites, for instance, might take certain large input files from the operation or the archive rather than downloading them again) - these can now be maintained as a single master suite definition that reconfigures itself according to the value of a flag variable indicating the intended use.
Template processing is the first thing done on parsing a suite definition so Jinja2 expressions can appear anywhere in the file (inside strings and namespace headings, for example).
Jinja2 is well documented at http://jinja.pocoo.org/docs, so here we just provide an example suite that uses it. The meaning of the embedded Jinja2 code should be reasonably self-evident to anyone familiar with standard programming techniques.
The jinja2.ensemble example, graphed in Figure 29, shows an ensemble of similar tasks generated using Jinja2:
Here is the generated suite definition, after Jinja2 processing:
And finally, the jinja2.cities example uses variables, includes or excludes special cleanup tasks according to the value of a logical flag, and automatically generates all dependencies and family relationships for a group of tasks that is repeated for each city in the suite. To add a new city and associated tasks and dependencies simply add the city name to the list at the top of the file. The suite is graphed, with the New York City task family expanded, in Figure 30.
Access to environment variables is not provided by Jinja2 by default, but cylc automatically imports the user environment to the template in a dictionary structure called environ. A usage example:
This example emphasizes that the environment is read on the suite host at the time the suite definition is parsed - it is not, for instance, read at task run time on the task host.
Jinja2 variable values can be modified by “filters”, using pipe notation. For example, the built-in trim filter strips leading and trailing white space from a string:
(See official Jinja2 documentation for available built-in filters.)
Cylc also supports custom Jinja2 filters. A custom filter is a single Python function in a source file with the same name as the function (plus “.py” extension) and stored in one of the following locations:
In the filter function argument list, the first argument is the variable value to be “filtered”, and subsequent arguments can be whatever is needed. Currently there is one custom filter called “pad” in the central cylc Jinja2 filter directory, for padding string values to some constant length with a fill character - useful for generating task names and related values in ensemble suites:
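A filter function of the kind described above might look like the following sketch. It is not a copy of the pad filter shipped with cylc: the argument names, the default fill character, and the choice to pad on the left are assumptions made for illustration.

```python
def pad(value, length, fillchar=" "):
    """Pad a value on the left to a constant length (sketch only).

    The first argument is the value being filtered; the remaining
    arguments are supplied in the template after the filter name.
    """
    return str(value).rjust(int(length), str(fillchar))


# Generating zero-padded ensemble member names (hypothetical naming scheme).
members = ["mem_" + pad(i, 3, "0") for i in range(1, 4)]
print(members)
```

In a template the same call would be written in pipe notation, with the variable on the left of the pipe taking the place of the first argument.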
Associative arrays (dicts in Python) can be very useful. Here’s an example, from $CYLC_DIR/examples/jinja2/dict:
Here’s the result:
The values of Jinja2 variables can be passed in from the cylc command line rather than hardwired in the suite definition. Here’s an example, from $CYLC_DIR/examples/jinja2/defaults:
Here’s the result:
Note also that cylc view --set FIRST_TASK=bob --jinja2 SUITE will show the suite with the Jinja2 variables as set.
Warning: suites started with template variables set on the command line do not currently restart with the same settings - you have to set them again on the cylc restart command line.
Several special variables are used as placeholders in cylc suite definitions:
To use proper variables (c.f. programming languages) in suite definitions, see the Jinja2 template processor (Section 9.6).
It is sometimes convenient to omit certain tasks from the suite at runtime without actually deleting their definitions from the suite.
Defining [runtime] properties for tasks that do not appear in the suite graph results in verbose-mode validation warnings that the tasks are disabled. They cannot be used because the suite graph is what defines their dependencies and valid cycle times. Nevertheless, it is legal to leave these orphaned runtime sections in the suite definition because it allows you to temporarily remove tasks from the suite by simply commenting them out of the graph.
To omit a task from the suite at runtime but still leave it fully defined and available for use (by insertion or cylc submit) use one or both of the [scheduling][[special tasks]] lists, include at start-up or exclude at start-up (documented in Sections A.3.5.8 and A.3.5.7). Then the graph still defines the validity of the tasks and their dependencies, but they are not actually inserted into the suite at start-up. Other tasks that depend on the omitted ones, if any, will have to wait on their insertion at a later time or otherwise be triggered manually.
Finally, with Jinja2 (Section 9.6) you can radically alter suite structure by including or excluding tasks from the [scheduling] and [runtime] sections according to the value of a single logical flag defined at the top of the suite.
A naked dummy task appears in the suite graph but has no explicit runtime configuration section. Such tasks automatically inherit the default “dummy task” configuration from the root namespace. This is very useful because it allows functional suites to be mocked up quickly for test and demonstration purposes by simply defining the graph. It is somewhat dangerous, however, because there is no way to distinguish an intentional naked dummy task from one generated by typographic error: misspelling a task name in the graph results in a new naked dummy task replacing the intended task in the affected trigger expression; and misspelling a task name in a runtime section heading results in the intended task becoming a dummy task itself (by divorcing it from its intended runtime config section).
To avoid this problem any dummy task used in a real suite should not be naked - i.e. it should have an explicit entry under the runtime section of the suite definition, even if the section is empty. This results in exactly the same dummy task behaviour, via implicit inheritance from root, but it allows use of cylc validate --strict to catch errors in task names by failing the suite if any naked dummy tasks are detected.
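The kind of check that strict validation makes possible can be sketched with two set differences. This is not how cylc validate is implemented; the function name and the task names are hypothetical, and the runtime names are assumed to be task-level sections only (families and root excluded).

```python
def find_task_name_problems(graph_tasks, runtime_tasks):
    """Compare task names used in the graph with task-level runtime sections."""
    graph_tasks = set(graph_tasks)
    runtime_tasks = set(runtime_tasks)
    # In the graph but with no runtime section: naked dummy tasks.
    naked = sorted(graph_tasks - runtime_tasks)
    # Runtime section but absent from the graph: disabled (orphaned) tasks.
    orphaned = sorted(runtime_tasks - graph_tasks)
    return naked, orphaned


# A misspelled runtime heading ("modle") leaves "model" naked in the graph.
naked, orphaned = find_task_name_problems(
    graph_tasks=["prep", "model", "post"],
    runtime_tasks=["prep", "modle", "post"],
)
print("naked dummy tasks:", naked)
print("orphaned runtime sections:", orphaned)
```

If every intentional dummy task has an explicit (possibly empty) runtime section, any name reported as naked is almost certainly a typographic error.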
Existing tasks (models, scripts, etc.) can be used by cylc without any modification, with the following few exceptions:
Simple tasks can be entirely implemented within the suite.rc file - task command scripting can be a multi-line string.
Tasks should abort with non-zero exit status if a fatal error occurs (this is just standard coding practice anyway). This allows cylc’s task job scripts to automatically trap errors and send a cylc task failed message back to the suite. The shell set -e option can be used in lieu of explicit error checks for every command:
If a task has internal outputs that others need to trigger off then it must report completion of those outputs at the appropriate time. Output messages must be unique within the suite or else downstream tasks will trigger off whichever task happens to send the message first; they must exactly match the corresponding outputs registered for the task in the suite definition; and for cycling tasks they must contain the cycle time in order to distinguish between the same outputs of the same task at other cycle times.
The “outputs” example is a self-contained suite that illustrates this:
Note the use of [T] as a placeholder for cycle time in messages registered under [[[outputs]]]. These strings are held inside cylc for comparison with incoming task messages; they are never interpreted by the shell and may not contain shell environment variables. The actual messaging calls made by running tasks, on the other hand, can make use of variables in the task runtime environment.
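The comparison between registered outputs and incoming messages can be pictured with the following sketch. It only illustrates the exact-match requirement; the function names, the output label, the message text, and the cycle time format are assumptions, and offsets from the cycle time are not handled.

```python
def expected_messages(registered_outputs, cycle_time):
    """Substitute the cycle time for the [T] placeholder in each output."""
    return {label: template.replace("[T]", cycle_time)
            for label, template in registered_outputs.items()}


def match_output(incoming, registered_outputs, cycle_time):
    """Return the label of the output that an incoming message completes."""
    expected = expected_messages(registered_outputs, cycle_time)
    for label, message in expected.items():
        if incoming == message:  # the match must be exact
            return label
    return None


# Hypothetical registered output and cycle time.
OUTPUTS = {"out1": "surface fields ready for [T]"}

hit = match_output("surface fields ready for 2013080100",
                   OUTPUTS, "2013080100")
miss = match_output("surface fields ready", OUTPUTS, "2013080100")
print(hit, miss)
```

A message that omits the cycle time, or differs in any other character, simply fails to match and no downstream task is triggered by it.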
General (non-output) messages can also be sent to report progress, warnings, and so on, e.g.:
Explanatory messages can be sent before aborting on error:
Or equivalently, with different syntax:
But not this:
If critical errors are not reported in this way task failures will still be detected and logged by cylc, but you may have to examine task logs to determine what the problem was.
If a task spawns another job internally and then detaches and exits without seeing the spawned process through, you must arrange for the detached process to send its own completion messages, because the cylc-generated job script cannot know when it is finished.
First check that you can’t “reconnect” the detaching process. If it is a background shell process, for instance, just run it in the foreground instead. For loadleveler jobs the -s option prevents llsubmit from returning until the job has completed. For Sun Grid Engine, qsub -sync yes has the same effect. Section 11.5 shows how to override the job submission command template to achieve this.
If the detaching process cannot be reconnected, disable cylc’s automatic completion messaging:
The cylc messaging commands are called like this:
They read environment variables that identify the calling task and the target suite, so the task execution environment must be propagated to the detached process.
One way to handle this is to write a task wrapper that modifies a copy of the detaching native job scripts, on the fly, to insert completion messaging in the appropriate places. An advantage of this method is that you don’t need to permanently modify the model or its associated native scripting for cylc. Another is that you can configure the native job setup for a single test case (running it without cylc) and then have your custom wrapper modify the standalone test case on the fly with suite, task, and cycle-specific parameters as required.
To make this easier, for tasks that declare manual completion messaging cylc makes non user-defined environment scripting available in a variable $CYLC_SUITE_ENVIRONMENT, the value of which can be inserted at the appropriate point in the task scripts (just prior to calling the cylc messaging commands as above).5
Another reason to avoid detaching tasks if possible is that they cannot be polled or killed because there is no way for cylc to determine the job ID of the detached process. Attempted polling of a detaching task will just result in cylc logging a warning message.
The detaching example suite contains a script model.sh that runs a pseudo model as follows:
this is in turn executed by a script run-model.sh that detaches immediately after job submission (i.e. it exits before the model executable actually runs):
Note that your at scheduler daemon must be up if you want to test this suite.
Here’s a cylc suite to run this unruly model:
The suite invokes the task by means of the custom wrapper model-wrapper.sh which modifies, on the fly, a temporary copy of the model’s native job scripts as described above:
If you run this suite, or submit the model task alone with cylc submit, you’ll find that the usual job submission log files for task stdout and stderr end before the task is finished. To see the “model” output and the final task completion message (success or failure), examine the log files generated by the job submitted internally to the at scheduler (their location is determined by the $PREFIX variable in the suite.rc file).
It should not be difficult to adapt this example to real tasks with detaching internal job submission. You will probably also need to replace other parameters, such as model input and output filenames, with suite- and cycle-appropriate values, but exactly the same technique can be used: identify which job script needs to be modified and use text processing tools (such as the single line perl search-and-replace expressions above) to do the job.
Task Implementation (Section 10) describes what requirements a command, script, or program, must fulfill in order to function as a cylc task. This section explains how tasks are submitted by cylc when they are ready to run, and how to define new task job submission methods.
For most job submission methods cylc now supports polling for real task status, and job kill, from the gcylc GUI and command line (cylc poll and cylc kill). In addition to on-demand polling, submitted and running tasks are polled automatically on suite restart (Section 12.7) and on job submission and execution timeouts, and one-way polling can be used as a regular health check for submitted tasks, and to track tasks on hosts that do not allow return routing for task messaging (Section 12).
Task poll and kill support has not yet been added to the SGE and slurm job submission methods. It will be added in an upcoming release.
When a task is ready to run cylc generates a temporary task job script to configure the task’s execution environment and call its command scripting. The job script is the embodiment of all suite.rc runtime settings for the task. It is submitted to run by the job submission method configured for the task. Different tasks can have different job submission methods. Like other runtime properties, you can set a suite default job submission method and override it for specific tasks or families:
As shown in the Tutorial Section 7.11, job scripts are saved to the suite run directory; the commands used to submit them are printed to stdout by cylc; and they can be printed with the cylc log command or new ones generated and printed with the cylc jobscript command. Take a look at one to see exactly how cylc wraps and runs your tasks.
Cylc supports a number of commonly used job submission methods, and Section 11.6 shows how to add support for other user-defined job submission methods.
Runs tasks directly in a background shell.
Submits tasks to the rudimentary Unix at scheduler. The atd daemon must be running.
Submits tasks to loadleveler by the llsubmit command. Loadleveler directives can be provided in the suite.rc file:
These are written to the top of the task job script like this:
Submits tasks to PBS (or Torque) by the qsub command. PBS directives can be provided in the suite.rc file:
These are written to the top of the task job script like this:
Submits tasks to Sun/Oracle Grid Engine by the qsub command. SGE directives can be provided in the suite.rc file:
These are written to the top of the task job script like this:
Submits tasks to the Simple Linux Utility for Resource Management (SLURM) by the sbatch command. SLURM directives can be provided in the suite.rc file (note that since not all SLURM commands have a short form, cylc requires the long form directives):
These are written to the top of the task job script like this:
For job submission methods that use job file directives (PBS, loadleveler, etc.) default directives are provided to set the job name and stdout and stderr file paths.
As shown in the example above, multiple entries for the same PBS or SGE directive option must be comma-separated on the same line, in the suite.rc file. Otherwise, repeating the option on another line will override the previous entry, not add to it. Also, the right-hand side must be quoted to hide the comma from the suite.rc parser (commas indicate list values, whereas directives are treated as singular).
As also shown in the example above, to get a naked option flag such as -cwd in SGE you must give a quoted blank space as the directive value in the suite.rc file.
When a task is ready to run cylc generates a filename root to be used for the task job script and log files. The filename contains the task name, cycle time (or integer tag), and a submit number that increments if the same task is re-triggered multiple times:
How the stdout and stderr streams are directed into these files depends on the job submission method. The background method just uses appropriate output redirection on the command line, as shown above. The loadleveler method writes appropriate directives to the job script that is submitted to loadleveler.
Cylc obviously has no control over the stdout and stderr output from tasks that do their own internal output management (e.g. tasks that submit internal jobs and direct the associated output to other files). For less internally complex tasks, however, the files referred to here will be complete task job logs.
To change the form of the actual command used to submit a job you do not need to define a new job submission method; just override the command template in the relevant job submission sections of your suite.rc file:
As explained in the suite.rc reference (Appendix A), the template’s first %s will be substituted by the job file path and, where applicable, a second and third %s will be substituted by the paths to the job stdout and stderr files.
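The substitution rule above can be sketched with ordinary Python string formatting. This is an illustration, not cylc's code: the function name, the file paths, and the second template are hypothetical, and templates containing other percent sequences are not handled.

```python
def build_submit_command(template, jobfile, stdout_path=None, stderr_path=None):
    """Fill a command template with the job file and, if present, log paths."""
    # Use only as many values as the template has %s placeholders.
    count = template.count("%s")
    values = (jobfile, stdout_path, stderr_path)[:count]
    return template % values


# Hypothetical paths for one task job.
JOB = "/path/to/foo.2013080100.1"

# One placeholder: only the job file path is substituted.
print(build_submit_command("llsubmit -s %s", JOB))

# Three placeholders: job file, then stdout and stderr paths.
print(build_submit_command("%s 1>%s 2>%s &", JOB, JOB + ".out", JOB + ".err"))
```

A template with a single placeholder suits batch systems that take their log paths from job file directives, while a method that redirects output on the command line needs all three.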
Defining a new job submission method requires a little Python programming. You can derive (in the sense of object oriented programming inheritance) new methods from one of the existing ones, or directly from cylc’s job submission base class,
using the existing job submission methods as examples.
The following user-defined job submission class, called qsub, overrides the built-in pbs class to change the directive prefix from #PBS to #QSUB:
To check that this works correctly save the new source file to qsub.py in one of the allowed locations (see just below), use it in a suite definition:
and generate a job script to see the resulting directives:
Your new job submission class code should be saved to a file with the same name as the class (plus “.py” extension). It can reside in any of the following locations, depending on how generally useful the new method is and whether or not you have write-access to the cylc source tree:
Note that the form of the import statement at the top of the new user-defined Python module differs depending on whether or not the file is installed in the cylc source tree (see the comment at the top of the example file above).
To learn how to control running suites please also see the Tutorial (Section 7) and the command documentation (Section C), and experiment with plenty of test suites.
Cylc has three ways of tracking the progress of tasks, configured per task host in the site and user config files (Section 6). All three methods can be used on different task hosts within the same suite if necessary.
The Pyro communication method is the default because it is the most direct and efficient; the ssh method inserts an extra step in the process (command re-invocation on the suite host); and task polling is the least efficient because results are checked at predetermined intervals, not when task events actually occur.
Be careful to avoid spamming task hosts with polling commands. Each poll opens (and then closes) a new ssh connection. Polling subprocesses are batched by cylc, and the number invoked at once can be configured in the suite definition:
Polling intervals are configurable here because they should be appropriate to the expected task run length. For instance, a task that typically takes an hour to run might be polled every 10 minutes initially, and then every minute toward the end of its run. Interval values are used in turn until the last value, which is used repeatedly until finished:
A list of intervals with optional multipliers can be used for both submission and execution polling, although a single value is probably sufficient for submission polling. If these items are not configured, default values from the site and user config files will be used for the polling task communication method; polling is not done by default under the other task communication methods (but it can still be used if you like).
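The way a list of intervals is consumed (each value in turn, then the last value repeated) can be sketched with a generator. The function name is hypothetical and the intervals are plain numbers of minutes; parsing of the suite.rc multiplier syntax is not attempted here.

```python
from itertools import accumulate, chain, islice, repeat


def poll_delays(intervals):
    """Yield each interval in turn, then repeat the last one indefinitely."""
    if not intervals:
        raise ValueError("at least one polling interval is required")
    return chain(intervals[:-1], repeat(intervals[-1]))


# An hour-long task: five polls 10 minutes apart, then one poll per minute.
intervals = [10] * 5 + [1]

first_delays = list(islice(poll_delays(intervals), 8))
poll_times = list(accumulate(first_delays))
print(first_delays)
print(poll_times)
```

With these values the task is left alone for most of its expected run and then watched closely near the expected finish, which keeps the number of ssh connections opened on the task host small.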
Polling is also done automatically once on job submission and execution timeouts, to see if the timed-out task has failed or not; and on suite restarts, to see what happened to any tasks that were orphaned when the suite went down.
If Pyro and ssh ports are blocked but you don’t want to use polling from the suite host,
Here are the site and user config items relevant to task tracking:
User-invoked commands that connect to running suites can also choose between direct communication across network sockets (Pyro) and re-invocation of commands on the suite host using passwordless ssh (there is a --use-ssh command option for this purpose).
The gcylc GUI requires direct Pyro connections to its target suite. If that is not possible, run gcylc on the suite host.
All Pyro connections to a running suite (task messaging and user-invoked commands) must authenticate with an arbitrary single line of text in a file called passphrase, which will be found and used automatically if installed properly - see below. A secure MD5 checksum, not the raw passphrase, is passed across the network. A random passphrase is generated in the suite definition directory when a suite is registered, but you can create your own if you wish.
For ssh task messaging and user command re-invocation, on the other hand, the suite passphrase is only required on the suite host account but ssh keys must be installed for passwordless connections instead.
Suite passphrases currently have to be installed manually to all task host accounts that use the Pyro communication method (see above); and also to accounts used to run commands that interact directly with the suite via Pyro.
Legal passphrase locations, in order of preference, are:
Remote tasks know the location of the remote suite definition directory (if one exists) through their execution environment. Local (suite host) user command invocations can find the suite definition directory in the suite name database. Remote user command invocations, however, cannot interrogate the database on the command host because the suite will not be registered there (cylc cannot assume that the command host shares a common filesystem with the suite host). Consequently remote command host accounts must have the suite passphrase installed in one of the secondary locations under $HOME/.cylc/.
Running tasks need access to cylc via $PATH, principally for the task messaging commands. To allow this, the first thing a task job script does is set $CYLC_VERSION to the cylc version number of the running suite. If you need to run several suites at once under different incompatible versions of cylc, check that your site is using the cylc version wrapper (see INSTALL and admin/cylc-wrapper in a cylc installation) then set $CYLC_VERSION to the desired version. In the case of developers wishing to run their own copy of cylc rather than a centrally installed one, set $CYLC_HOME to point to your cylc copy.
A restarted suite (see cylc restart --help) is initialized from a previous recorded suite state dump so that it can carry on from wherever it got to before being shut down or killed.
Tasks that were recorded in the submitted or running states are now automatically polled on restart, to see if they are still submitted (e.g. waiting in a PBS batch queue or similar), still running, or if they finished (succeeded or failed) while the suite was down.
Tasks recorded in the failed state at shutdown are not automatically resubmitted on restarting the suite, in case the underlying problem has not been addressed yet.
As a suite runs its task proxies transition through the following states:
Note that greyed-out “base graph nodes” in the gcylc graph view do not represent task states; they are displayed to fill out the graph structure where corresponding task proxies do not currently exist in the live task pool.
For manual task state reset purposes ready is a pseudo-state that means waiting with all prerequisites satisfied.
Connecting to a running suite requires knowing the network port it is listening on, and the suite passphrase to authenticate with once a connection is made to the port.
Suites write their port number to $HOME/.cylc/ports/<SUITE> at start-up, and suite-connecting commands read this file to get the number.6 An exception to this is the messaging commands called by tasks. Running tasks know the port number from the execution environment provided by the suite (via the task job script).
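The port file lookup can be pictured with the following Python sketch. It assumes that the port file contains nothing but the port number as text, which is not stated above; the helper names are hypothetical.

```python
import os

def port_file_path(suite, home=None):
    """Build the path $HOME/.cylc/ports/<SUITE> for the given suite name."""
    home = os.environ.get("HOME", "") if home is None else home
    return os.path.join(home, ".cylc", "ports", suite)

def parse_port(text):
    """Interpret the file content as a bare port number (assumed format)."""
    return int(text.strip())

def read_port(suite, home=None):
    """Read the port number that a running suite wrote at start-up."""
    with open(port_file_path(suite, home)) as handle:
        return parse_port(handle.read())
```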
So, to connect to a suite running on another account you must install the suite passphrase (Section 12.5.1), and configure passwordless ssh so that the port number can be retrieved from the remote port file. Then use the --user and --host command options to connect:
If you know the port number of the target suite, give it on the command line to prevent the port-retrieving ssh connection being attempted:
Possession of a suite passphrase gives full control over the suite, and ssh access to the port file also implies full access to the suite host account, so it is recommended that this only be used to interact with your own suites running on other hosts. We plan to implement finer-grained authentication in the future to allow suite owners to grant read-only access to others.
Cylc now handles task job submission in a dedicated worker thread so that submission of many remote tasks at once does not impact cylc’s performance or responsiveness.
Further, for maximum efficiency, job submissions are batched inside the worker thread: batch members are submitted in parallel, and all members must complete (the job submission process, that is, not the submitted task) before the next batch is handled. There is a configurable delay between batches to avoid swamping the host system in the event that hundreds of tasks become ready at the same time:
Here a 120 task ensemble, for example, would be submitted in two batches of 50 followed by one of 20, with a 10 second delay between batches.
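The batching behaviour described above can be sketched in Python. This is not cylc's worker thread code: the function names are hypothetical and `submit` stands in for whatever performs one job submission. Members of a batch are submitted in parallel, and the next batch starts only after all of them have returned and the delay has elapsed.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def make_batches(tasks, batch_size):
    """Split the task list into consecutive batches of at most batch_size."""
    return [tasks[i:i + batch_size] for i in range(0, len(tasks), batch_size)]

def submit_in_batches(tasks, submit, batch_size=50, delay=10.0, sleep=time.sleep):
    """Submit tasks batch by batch and return the size of each batch."""
    batches = make_batches(tasks, batch_size)
    for number, batch in enumerate(batches):
        # Leaving the "with" block waits for every submission in the batch.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            list(pool.map(submit, batch))
        if number < len(batches) - 1:
            sleep(delay)  # pause between batches, not after the last one
    return [len(batch) for batch in batches]
```

With 120 tasks and the values quoted above, this gives batch sizes of 50, 50, and 20, with two 10 second pauses.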
A connection timeout can be set in site and user config files (see Section 6) so that messaging commands cannot hang indefinitely, and thereby prevent the task from completing, if the suite is not responding (this can be caused by suspending a suite with Ctrl-Z). The same can be done on the command line for other suite-connecting user commands, with the --pyro-timeout option.
Some cylc suites have the potential to generate too much activity at once by virtue of the fact that each task cycles independently constrained only by dependence on other tasks or by clock triggers. Quick-running tasks at the top of the dependency tree with no prerequisites and no clock-triggers (or when running far behind the clock) will spawn rapidly into the future if not constrained somehow. There are two issues to be aware of here: over-burdening task host resources by submitting too many tasks at once, and over-burdening cylc itself by letting the task pool become too big (when fast tasks spawn ahead of the pack cylc has to keep them around in the succeeded state until other tasks, which may depend on them, have caught up).
The runahead limit prevents the fastest tasks in a suite from getting too far ahead of the slowest ones. Cylc’s cycle-interleaving abilities make for generally efficient scheduling, but there is no great advantage in letting a few fast data retrieval tasks, say, get far ahead of the slower tasks because it is typically the tasks at the bottom of the dependency tree, which necessarily run last, that generate the final products.
A cycling task spawns its successor when it enters the submitted state or, for sequential tasks, when it finishes. If a newly spawned task’s cycle time is ahead of the oldest non-finished (succeeded or failed) task by more than the runahead limit it is put into the special runahead held state until other tasks catch up sufficiently; i.e. the runahead limit constrains the number of cycles that can run at once.
The default runahead limit is normally set to twice the minimum cycling interval in the suite. For a suite with 1- and 24-hourly cycling tasks the default limit will be 2 hours, so that two of the hourly cycles can run at once in between the 24-hourly cycles. If there are any future triggers present (graph = "foo[T+24] => bar") that extend beyond the default limit, it is adjusted up to equal the future offset plus one minimum cycling interval.
A manually set runahead limit should not stall the suite even if set to less than the minimum cycling interval, unless it does not extend out past any future triggers (see Section 9.3.4.10).
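The default limit calculation described above amounts to the following arithmetic, sketched here in Python with all values in hours. The function name and argument names are hypothetical.

```python
def default_runahead_limit(cycling_intervals, max_future_offset=0):
    """Return the default runahead limit in hours.

    cycling_intervals: the cycling intervals (hours) present in the suite.
    max_future_offset: the largest future trigger offset (hours), if any.
    """
    min_interval = min(cycling_intervals)
    limit = 2 * min_interval
    if max_future_offset > limit:
        # A future trigger reaches beyond the default, so extend the limit.
        limit = max_future_offset + min_interval
    return limit
```

For the 1- and 24-hourly suite this gives 2 hours, or 25 hours if a foo[T+24] future trigger is present.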
Succeeded and failed tasks are ignored when applying the runahead limit (but tasks that can’t run because they depend on a failed task are not ignored, of course).
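The hold decision can be sketched as below, with cycle times simplified to integer hours. The pool representation (a list of cycle time and state pairs) and the function name are hypothetical; cylc's internal data structures are not shown here.

```python
def is_runahead_held(new_cycle, pool, limit):
    """Decide whether a newly spawned task should be held.

    pool is a list of (cycle_hours, state) pairs; tasks in the succeeded
    or failed states are ignored when finding the oldest unfinished cycle.
    """
    unfinished = [cycle for cycle, state in pool
                  if state not in ("succeeded", "failed")]
    if not unfinished:
        return False
    return new_cycle - min(unfinished) > limit
```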
Large suites could potentially swamp the task host hardware or external batch queueing system, depending on the chosen job submission method, by submitting too many tasks at once. Cylc’s internal queues prevent this by limiting the number of tasks, within defined groups, that are active (submitted or running) at once.
A queue is defined by a name; a limit, which is the maximum number of active tasks allowed for the queue; and a list of member tasks, which are assigned by name to the queue.
Queue configuration is done under the [scheduling] section of the suite.rc file, not as part of the runtime namespace hierarchy, because, like dependencies, queues constrain when a task runs rather than what runs after it is submitted. When runtime family relationships and queues do coincide you can assign task family members en masse to queues by using the family name, as shown in the example suite listing below.
By default every task is assigned to a default queue, which by default has a zero limit (interpreted by cylc as no limit). To use a single queue for the whole suite just set the default queue limit:
To use other queues just name each one, set the limit, and assign member tasks:
Any tasks not assigned to a particular queue will remain in the default queue. The queues example suite illustrates how queues work by running two task trees side by side (as seen in the graph GUI), limited to 2 and 3 active tasks respectively:
Note assignment of runtime task family members to queues using the family name.
See also Section A.4.1.9 in the Suite.rc Reference.
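The effect of a queue limit can be illustrated with a short Python sketch. It is a simplified model, not cylc's queue code: the function name is hypothetical, and a limit of zero is treated as no limit, as described above for the default queue.

```python
def releasable(limit, active_count, ready_tasks):
    """Return the ready tasks that a queue may release right now.

    limit: maximum number of active (submitted or running) tasks; 0 = no limit.
    active_count: number of queue members currently active.
    ready_tasks: queue members waiting to be submitted, in order.
    """
    if limit == 0:
        return list(ready_tasks)
    free_slots = max(limit - active_count, 0)
    return list(ready_tasks[:free_slots])
```

For a queue limited to 3 tasks with 2 already active, only the first ready member is released; the rest wait until an active task finishes.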
Tasks can be configured with a list of “retry delay” periods, in minutes, such that if a task fails it will go into a temporary retrying state and then automatically resubmit itself after the next specified delay period expires. A usage example is shown in the suite listed below under Suite And Task Event Handling, Section 12.13.
See also Sections A.2.8 and A.4.1.20 in the Suite.rc Reference.
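The retry logic can be summarised in a few lines of Python. This is a sketch of the behaviour described above, not cylc's code; the function name is hypothetical and delays are in minutes.

```python
def next_retry_delay(retry_delays, failure_number):
    """Return the delay before resubmission, or None if retries are used up.

    retry_delays: the configured list of delay periods, in minutes.
    failure_number: 1 for the first failure of the task, 2 for the second, ...
    """
    if 1 <= failure_number <= len(retry_delays):
        return retry_delays[failure_number - 1]
    return None  # no retries left: the task goes to the failed state
```

With delays of 1 and 5 minutes, the first failure leads to a retry after 1 minute, the second after 5 minutes, and the third leaves the task failed.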
Cylc can call nominated event handlers when certain suite or task events occur. This is intended to facilitate centralized alerting and automated handling of critical events. Event handlers can send an email or an SMS, call a pager, and so on; or intervene in the operation of their own suite using cylc commands. cylc [hook] email-suite and cylc [hook] email-task are example event handlers packaged with cylc.
Event handlers can be located in the suite bin directory, otherwise it is up to you to ensure their location is in $PATH (in the shell in which cylc runs, on the suite host).
Task event handlers are passed the following arguments by cylc:
where EVENT is one of the following:
MESSAGE, if provided, describes what has happened, and TASKID identifies the task (NAME.CYCLE for cycling tasks).
The retry event occurs if a task fails and has any remaining retries configured (see Section 12.12). The event handler will be called as soon as the task fails, not after the retry delay period when it is resubmitted.
Note that event handlers are called by cylc itself, not by the running tasks, so if you wish to pass them additional information via the environment you must use [cylc] → [[environment]], not task runtime environments.
Here is an example suite that tests the retry and failed events. The handler in this case simply echoes its command line arguments to suite stdout.
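A handler that simply echoes its command line arguments could be written in Python along the following lines. This is a hypothetical stand-alone script, not one of the handlers packaged with cylc; it makes no assumption about the order of the arguments and just prints them as received.

```python
import sys

def format_event(arguments):
    """Join the handler's command line arguments into one log line."""
    return "EVENT HANDLER: " + " ".join(arguments)

if __name__ == "__main__":
    # Cylc calls the handler itself, so this output goes to suite stdout.
    print(format_event(sys.argv[1:]))
```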
The cylc reload command reloads the suite definition at run time. This allows: (a) changing task config such as command scripting or environment; (b) adding tasks to, or removing them from, the suite definition, at run time - without shutting the suite down and restarting it. (It is easy to shut down and restart cylc suites, but reloading may be useful if you don’t want to wait for long-running tasks to finish first).
Note that defined tasks can already be added to or removed from a running suite with the ’cylc insert’ and ’cylc remove’ commands; the reload command allows addition and removal of task definitions. If a new task definition is added (and used in the graph) you will still need to manually insert an instance of it (with a particular cycle time) into the running suite. If a task definition (and its use in the graph) is deleted, existing task proxies of the deleted type will run their course after the reload but new instances will not be spawned. Changes to a task definition will only take effect when the next task instance is spawned (existing instances will not be affected).
Some HPC facilities allow job preemption: the resource manager can kill or suspend running low priority jobs in order to make way for high priority jobs. The preempted jobs may then be automatically restarted by the resource manager, from the same point (if suspended) or requeued to run again from the start (if killed). If a running cylc task gets suspended or hard-killed (kill -9 <PID> is not a trappable signal so cylc cannot detect task failure in this case) and then later restarted, it will just appear to cylc as if it takes longer than normal to run. If the job is soft-killed the signal will be trapped by the task job script and a failure message sent, resulting in cylc putting the task into the failed state. When the preempted task restarts and sends its started message cylc would normally treat this as an error condition (a dead task is not supposed to be sending messages) - a warning will be logged and the task will remain in the failed state. However, if you know that preemption is possible on your system you can tell cylc that affected tasks should be resurrected from the dead, to carry on as normal if progress messages start coming in again after a failure:
To test this in any suite, manually kill a running task then, after cylc registers the task failed, resubmit the killed job manually by cutting-and-pasting the original job submission command from the suite stdout stream.
The cylc broadcast command overrides [runtime] settings in a running suite. This can be used to communicate information to downstream tasks by broadcasting environment variables (communication of information from one task to another normally takes place via the filesystem, i.e. the input/output file relationships embodied in inter-task dependencies). Variables (and any other runtime settings) may be broadcast to all subsequent tasks, or targeted at a specific task, all subsequent tasks with a given name, or all tasks with a given cycle time; see broadcast command help for details.
Broadcast settings targeted at a specific task ID or cycle time expire and are forgotten as the suite moves on. Untargeted variables and those targeted at a task name persist throughout the suite run, even across restarts, unless manually cleared using the broadcast command - and so should be used sparingly.
When a suite is started with the cylc run command (cold or warm start) the cycle time at which it starts can be given on the command line or hardwired into the suite.rc file:
or,
An initial cycle time given on the command line will override one in the suite.rc file.
In the case of cold starts only, the initial cycle time is also passed through to task execution environments as $CYLC_SUITE_INITIAL_CYCLE_TIME. The intended use of this variable is to allow tasks to determine whether they are running in the initial cold-start cycle (when different behaviour may be required) or in a normal mid-run cycle. This is not done for warm starts because a warm start is really an implicit restart - it does not reference a particular previous suite state but it does assume that a previous cycle (for each task) has been run and completed entirely. It follows that in a warm start tasks are really in a normal mid-run cycle, and because no actual previous state is referenced $CYLC_SUITE_INITIAL_CYCLE_TIME gets the value None. After a cold-start, however, the value of the environment variable does persist across restarts because the original cold-start cycle time is stored in suite state dump files.
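A task implemented in Python could use this variable as sketched below. The name of the variable holding the task's own cycle time, $CYLC_TASK_CYCLE_TIME, is an assumption here (check the Task Execution Environment section for the actual name), and the function name is hypothetical.

```python
import os

def in_initial_cycle(env=None):
    """Return True only if the task is running in the cold-start cycle."""
    env = os.environ if env is None else env
    initial = env.get("CYLC_SUITE_INITIAL_CYCLE_TIME")
    current = env.get("CYLC_TASK_CYCLE_TIME")  # assumed variable name
    if initial is None or initial == "None":
        # Warm start (value "None") or variable not set at all.
        return False
    return initial == current
```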
Since cylc-4.6.0 any cylc suite can run in live, simulation, or dummy mode. Prior to that release simulation mode was a hybrid mode that replaced real tasks with local dummy tasks. This allowed local simulation testing of any suite, to get the scheduling right without running real tasks, but running dummy tasks locally does not add much value over a pure simulation (in which no tasks are submitted at all) because all job submission configuration has to be ignored and most task job script sections have to be cut out to avoid any code that could potentially be specific to the intended task host. So at 4.6.0 we replaced this with a pure simulation mode (task proxies go through the running state automatically within cylc, and no dummy tasks are submitted to run) and a new dummy mode in which only the real task command scripting is dummied out - each dummy task is submitted exactly as the task it represents on the correct host and in the same execution environment. A successful dummy run confirms not only that the scheduling works correctly but also tests real job submission, communication from remote task hosts, and the real task job scripts (in which errors such as use of undefined variables will cause a task to fail).
The run mode, which defaults to live, is set on the command line (for run and restart):
but you can configure the suite to force a particular run mode,
This can be used, for example, for demo suites that necessarily run out of their original context; or to temporarily prevent accidental execution of expensive real tasks during suite development.
Dummy mode task command scripting just prints a message and sleeps for ten seconds by default, but you can override this behaviour for particular tasks or task groups if you like. Here’s how to make a task sleep for twenty seconds and then fail in dummy mode:
Finally, in simulation mode each task takes between 1 and 15 seconds to “run” by default, but you can also alter this for particular tasks or groups of tasks:
Note that to get a failed simulation or dummy mode task to succeed on re-triggering, just change the suite.rc file appropriately and reload the suite definition at run time with cylc reload SUITE before re-triggering the task.
Dummy mode is equivalent to commenting out each task’s command scripting to expose the default scripting.
In simulation and dummy mode cylc uses an accelerated clock with configurable rate and offset relative to the suite’s initial cycle time. This affects the trigger time of any clock-triggered tasks in the suite, and the length of time between cycles if simulating “caught up” operation (without this a six-hour cycling suite, for instance, would wait six hours between cycles when simulating caught-up operation, even though the simulated or dummy tasks run very quickly). By configuring the initial clock offset you can quickly simulate how suites catch up and transition from delayed to real time operation.
See Section A.2.11 for accelerated clock configuration settings.
The run mode is recorded in the suite state dump file. Cylc will not let you restart a non-live mode suite in live mode, or vice versa - any attempt to do the former would certainly be a mistake (because the simulation mode dummy tasks do not generate any of the real outputs depended on by downstream live tasks), and the latter, while feasible, would corrupt the live state dump by turning it over to simulation mode. The easiest way to test a live suite in simulation mode, if you don’t want to obliterate the current state dump by doing a cold or warm start (as opposed to a restart from the previous state) is to take a quick copy of the suite and run the copy in simulation mode. However, if you really want to run a live suite forward in simulation mode without copying it, do this:
Reference tests are finite-duration suite runs that abort with non-zero exit status if any of the following conditions occur (by default):
The default shutdown event handler for reference tests is cylc hook check-triggering which compares task triggering information (what triggers off what at run time) in the test run suite log to that from an earlier reference run, disregarding the timing and order of events - which can vary according to the external queueing conditions, runahead limit, and so on.
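The idea of comparing triggering information while disregarding the timing and order of events can be sketched in Python. The record format used here (a task ID paired with the list of task IDs it triggered off) is hypothetical; the real suite log format and the logic of the packaged handler are not reproduced.

```python
def normalise(records):
    """Sort the records, and the triggers within each record."""
    return sorted((task, tuple(sorted(triggers))) for task, triggers in records)

def same_triggering(reference, test_run):
    """Return True if both runs contain the same triggering, in any order."""
    return normalise(reference) == normalise(test_run)
```

Two runs in which the same tasks triggered off the same upstream tasks compare equal even if the events were logged in a different order.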
To prepare a reference log for a suite, run it with the --reference-log option, and manually verify the correctness of the reference run.
To reference test a suite, just run it (in dummy mode for the most comprehensive test without running real tasks) with the --reference-test option.
A battery of reference tests is used to automatically test cylc before posting a new release version. Reference tests can also be used at cylc upgrade time to check that the upgrade will not break your own complex suites - the triggering check will catch any bug that causes a task to run when it shouldn’t, for instance; even in a dummy mode reference test the full task job script (sans real command scripting) has to execute successfully on the proper task host by the proper job submission method.
Reference tests can be configured with the following settings:
If the default reference test is not sufficient for your needs, firstly note that you can override the default shutdown event handler, and secondly that the --reference-test option is merely a short cut to the following suite.rc settings which can also be set manually if you wish:
The cylc suite-state command, which interrogates suite run databases, has a polling mode that waits on a given task achieving a given state. See cylc suite-state --help for command options and defaults.
The suite graph notation also allows you to define local tasks that, in effect, represent tasks in other suites by automatically polling for them using the cylc suite-state command. Here’s how to trigger a task bar off a task foo in another suite called other.suite:
Local task FOO will poll for the success of foo in suite other.suite at the same cycle time. Other task states can be polled like this,
Default polling parameters (the maximum number of polls and the interval between them) are printed by cylc suite-state --help. These can be configured if necessary under the local polling task runtime section:
The remote suite does not have to be running when polling commences (or at all if the remote condition has already been achieved) because the command interrogates the suite run database, not the suite server process.
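The polling behaviour, with a maximum number of polls and a fixed interval between them, follows a generic pattern that can be sketched in Python. Here `check` is a hypothetical callable standing in for one query of the remote suite's run database; this is not the code of cylc suite-state.

```python
import time

def poll_until(check, max_polls, interval, sleep=time.sleep):
    """Call check() up to max_polls times, sleeping interval seconds between.

    Returns True as soon as check() succeeds, or False if every poll fails.
    """
    for attempt in range(1, max_polls + 1):
        if check():
            return True
        if attempt < max_polls:
            sleep(interval)
    return False
```

Because each poll is an independent query, the condition may already be true on the first poll, in which case no waiting happens at all.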
For suites owned by others or those with run databases in non-standard locations use the --run-dir option or, in-suite,
To trigger off remote tasks with different cycle times just arrange for the local polling task to be on the same cycling sequence as the remote task that it represents. For instance, if local task cat cycles 6-hourly at 0,6,12,18 but needs to trigger off a remote task dog with cycle times of 3,9,15,21 hours,
This results in DOG having cycle times of 3,9,15,21 - the same as dog in other.suite.
The following topics have yet to be documented in detail.
Small groups of cylc users can of course share suites by manual copying, and generic revision control tools can be used on cylc suites as for any collection of files. Beyond this cylc does not have a built-in solution for suite storage and discovery, revision control, and deployment, on a network. That is not cylc’s core purpose, and large sites may have preferred revision control systems and suite meta-data requirements that are difficult to anticipate. We can, however, recommend the use of Rose to do all of this very easily and elegantly with cylc suites.
Rose is a framework for managing and running suites of scientific applications, developed at the UK Met Office for use with cylc. It is available under the open source GPL license.
A suite can contain a small number of large, internally complex tasks; a large number of small, simple tasks; or anything in between. Cylc can easily handle a large number of tasks, however, so there are definite advantages to fine-graining:
It should be possible to rerun a task by simply resubmitting it for the same cycle time. In other words, failure at any point during execution of a task should not render a rerun impossible by corrupting the state of some internal-use file, or whatever. It is difficult to overstate the usefulness of being able to rerun the same task multiple times, either outside of the suite with cylc submit, or by retriggering it within the running suite, when debugging a problem.
If a warm-cycled model simply overwrites its restart files in each run, the only cycle that can subsequently run is the next one. This is dangerous because if, accidentally or otherwise, the task runs for the wrong cycle time, its restart files will be corrupted such that the correct cycle can no longer run (probably necessitating a cold-start). Instead, consider organising restart files by cycle time, through a file or directory naming convention, and keep them in a simple rolling archive (cylc’s filename templating and housekeeping utilities can easily do this for you). Then, given availability of external inputs, you can easily rerun the task for any cycle still in the restart archive.
Cylc does not require that successive instances of the same task run sequentially. In order to take advantage of this and achieve maximum functional parallelism whenever the opportunity arises (usually when catching up from a delay) you should ensure that tasks that in principle do not depend on their own previous instances (the vast majority of tasks in most suites, in fact) do not do so in practice. In other words, they should be able to run as soon as their prerequisites are satisfied regardless of whether or not their predecessors have finished yet. This generally just means ensuring that all file I/O contains the generating task’s cycle time in the file or directory name so that there is no interference between successive instances. If this is difficult to achieve in particular cases, however, you can declare the offending tasks to be sequential.
Having all filenames, or perhaps the names of their containing directories, stamped with the cycle time of the generating task greatly aids in managing suite disk usage, both for archiving and cleanup. It also enables the aforementioned task rerunnability recommendation by avoiding overwrite of important files from one cycle to the next. Cylc has powerful utilities for cycle time offset filename templating and housekeeping.
The command line utility program cylc [util] cycletime computes offsets (in hours, days, months, and years) from a given or current (in the environment) cycle time, and optionally inserts the resulting computed cycle time, or components of it, into a given template string containing “YYYY” as a placeholder for the year value, “MM” for month, and so on. This can be used in the suite.rc environment or command scripting sections, or in task implementation scripting, to generate filenames containing the current cycle time (or some offset from it) for use by tasks.
See cylc [util] cycletime --help for examples.
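The kind of computation this utility performs can be approximated in Python with the standard datetime module. The sketch below is not the cylc utility: it assumes cycle times are written as YYYYMMDDHH, handles only hour and day offsets, and treats "DD" and "HH" as the day and hour placeholders, which is an assumption beyond the "YYYY" and "MM" named above.

```python
from datetime import datetime, timedelta

CYCLE_FORMAT = "%Y%m%d%H"  # assumed cycle time format: YYYYMMDDHH

def offset_cycle(cycle, hours=0, days=0):
    """Return the cycle time shifted by the given number of hours and days."""
    shifted = datetime.strptime(cycle, CYCLE_FORMAT) + timedelta(hours=hours, days=days)
    return shifted.strftime(CYCLE_FORMAT)

def fill_template(template, cycle):
    """Replace YYYY, MM, DD and HH in the template with parts of the cycle time."""
    parts = {"YYYY": cycle[0:4], "MM": cycle[4:6], "DD": cycle[6:8], "HH": cycle[8:10]}
    for placeholder, value in parts.items():
        template = template.replace(placeholder, value)
    return template

# File written by the cycle six hours before 2013010100:
previous = offset_cycle("2013010100", hours=-6)
filename = fill_template("precip-YYYYMMDD-HH.nc", previous)
```

Note that plain string replacement would also alter any other occurrence of the placeholder letters in the template, so templates for this sketch should avoid them.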
Dependencies between tasks usually, though not always, take the form of files generated by one task that are used by other tasks. It is possible to manage these files across a suite without hard wiring I/O locations and therefore compromising suite flexibility and portability.
For small suites you may be able to have all tasks read and write from a common workspace, thereby avoiding the need to move common files around. You should be able to define the workspace location once in the suite.rc file rather than hard wiring it into the task implementations.
Tasks can be added to a suite to move files from A’s output directory to B’s input directory, and so on. Many connector tasks may be able to call the same file transfer script or command, with task-dependent input parameters defined in the suite.rc file.
Whether or not your suite uses a single common workspace, passing common I/O paths to tasks via variables defined once in the suite.rc file should allow you to avoid using connector tasks at all, except where it is necessary to transfer files between machines, or similar.
If your suite contains multiple logically distinct tasks that actually have similar functionality (e.g. for moving files around, or for generating similar products from the output of several similar models) have the corresponding cylc tasks all call the same command, script, or executable - just provide different input parameters via the task command scripting and/or execution environment, in the suite.rc file.
If every task in a suite is configured to put its output under $HOME (i.e. the environment variable, literally, not the explicit path to your home directory; and similarly for temporary directories, etc.) then other users will be able to copy the suite and run it immediately, after merely ensuring that any external input files are in the right place.
For the ultimate in portability, construct suites in which all task I/O paths are dynamically configured to be user and suite (registration) specific, e.g.
(these variables are automatically exported to the task execution environment by cylc - see Task Execution Environment, Section 9.4.7). Then you can run multiple instances of the suite at once (even under the same user account) without changing anything, and they will not interfere with each other.
You can test changes to a portable suite safely by making a quick copy of it in a temporary directory, then modifying and running the test copy without fear of corrupting the output directories, suite logs, and suite state, of the original.
Where possible, no task should rely on the action of another task, except for the prerequisites embodied in the suite dependency graph that it has no choice but to depend on. If this rule is followed, your suite will be as flexible as possible in terms of being able to run single tasks, or subsets of the suite, whilst debugging or developing new features.7 For example, every task should create its own output directories if they do not already exist, instead of assuming their existence due to the action of some other task; then you will be able to run single tasks without having to manually create output directories first.
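For a task implemented in Python, creating output directories on demand takes a single standard library call. The function name and directory layout below are illustrative only.

```python
import os
import tempfile

def ensure_output_dir(path):
    """Create the directory (and any missing parents) if it does not exist."""
    os.makedirs(path, exist_ok=True)
    return path

# Usage: calling it again for an existing directory is harmless.
base = tempfile.mkdtemp()
out = ensure_output_dir(os.path.join(base, "output", "2013010100"))
ensure_output_dir(out)
```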
The only compulsory content of a cylc suite definition directory is the suite.rc file. However, you can store whatever you like in a suite definition directory;8 other files there will be ignored by cylc but suite tasks can access them via the $CYLC_SUITE_DEF_PATH variable that cylc automatically exports into the task execution environment. Disk space is cheap - if all programs, ancillary files, control files (etc.) required by the suite are stored in the suite definition directory instead of having the suite reference external build directories (etc.), you can turn the directory into a revision control repository and be virtually assured of the ability to exactly reproduce earlier versions, regardless of suite complexity.
Correct scheduling is not equivalent to “orderly generation of products by cycle time”. Under cylc, a product generation task will trigger as soon as its prerequisites are satisfied (i.e. when its input files are ready, generally) regardless of whether other tasks with the same cycle time have finished or have yet to run. If your product delivery or presentation system demands that all products for one cycle time are uploaded (or whatever) before any from the next cycle, then be aware that this may be quite inefficient if your suite is ever faced with catching up from a significant delay or running over historical data.
If you must, however, you can introduce artificial dependencies into your suite to ensure that the final products never arrive out of sequence. One way of doing this would be to have a final “product upload” task that depends on completion of all the real product generation tasks at the same cycle time, and then declare it to be sequential.
All tasks in a cylc suite know their own private cycle time, but most don’t care about the wall clock time - they just run when their prerequisites are satisfied. The exception to this is clock-triggered tasks, which wait on a wall clock time expressed as an offset from their own cycle time, in addition to any other prerequisites. The usual purpose of these tasks is to retrieve real time data from the external world, triggering at roughly the expected time of availability of the data. Triggering the task at the right time is up to cylc, but the task itself should go into a check-and-wait loop in case the data is delayed; only on successful detection or retrieval should the task report success and then exit (or perhaps report failure and then exit if the data has not arrived by some cutoff time).
Cylc suites, without modification, can handle real time and delayed operation equally well.
In real time operation clock-triggered tasks constrain the behaviour of the whole suite, or at least of all tasks downstream of them in the dependency graph.
In delayed operation (whether due to an actual delay in an operational suite or because you’re running an historical trial) clock-triggered tasks will not constrain the suite at all because their trigger times have already passed, and cylc’s cycle interleaving abilities come to the fore. But if a clock-triggered task happens to catch up to the wall clock, it will automatically wait again. In this way a cylc suite naturally and seamlessly transitions between delayed and real time operation as required.
Properties shared by multiple tasks (job submission settings, environment variables, command scripting, etc.) should ideally be defined only once. Cylc supports several ways of achieving this:
Multiple inheritance is very efficient when tasks share many properties. Jinja variables are more efficient when single items are shared by just a few tasks that don’t have anything else in common (e.g. an environment variable for the location of a shared file).
For environment variables in particular it may be tempting to define all variables for all tasks once under [root], but this is somewhat analogous to overuse of global variables in programming and it can make it difficult to determine which variables matter to which tasks. Environment filters (Section A.4.1.22) can be used to make this safer, however.
Finally, Jinja2 can also be used to avoid defining intermediate environment variables for the sole purpose of deriving other environment variables at task run time. Instead of this:
do this:
If the values of these Jinja2 variables are needed in external scripts, just translate them directly in environment sections:
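The comparison above can be sketched as follows. All names (DATA_ROOT, OBS_DIR, get_obs, get-obs.sh) are hypothetical. The first fragment defines an intermediate environment variable only so that another can be derived from it at task run time:

```cylc
[runtime]
    [[get_obs]]
        command scripting = get-obs.sh
        [[[environment]]]
            # DATA_ROOT exists only so that OBS_DIR can be derived from it
            DATA_ROOT = /path/to/data
            OBS_DIR = $DATA_ROOT/obs
```

The second does the derivation with Jinja2 when the suite definition is processed, and passes the result on to external scripts through the environment section:

```cylc
#!jinja2
{% set DATA_ROOT = "/path/to/data" %}
[runtime]
    [[get_obs]]
        command scripting = get-obs.sh
        [[[environment]]]
            # already resolved in the processed suite.rc; no intermediate
            # variable is needed in the task environment
            OBS_DIR = {{ DATA_ROOT }}/obs
```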
If you find yourself writing runtime scripting to get a task to change its behaviour significantly from one cycle to the next, consider that the graph is usually the proper place to express this sort of thing. Use different task names, but have them inherit common properties from a family namespace to avoid duplication. Instead of this:
do this:
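A sketch of the two approaches, with hypothetical task and script names (model, model_long, model_short, run-model.sh), and assuming the task cycle time is available to scripting as $CYLC_TASK_CYCLE_TIME in YYYYMMDDHH form. The first fragment switches behaviour inside the task scripting:

```cylc
[runtime]
    [[model]]
        command scripting = """
# behaviour depends on the cycle hour, which the graph cannot show
HH=${CYLC_TASK_CYCLE_TIME:8:2}
if [[ $HH == 00 || $HH == 12 ]]; then
    run-model.sh --length=72
else
    run-model.sh --length=24
fi"""
```

The second expresses the same difference in the graph, with shared settings inherited from a family namespace:

```cylc
[scheduling]
    [[dependencies]]
        [[[0,12]]]
            graph = "prep => model_long"
        [[[6,18]]]
            graph = "prep => model_short"
[runtime]
    [[MODEL]]
        # common properties are defined once, in the family
        command scripting = run-model.sh --length=$RUN_LENGTH
    [[model_long]]
        inherit = MODEL
        [[[environment]]]
            RUN_LENGTH = 72
    [[model_short]]
        inherit = MODEL
        [[[environment]]]
            RUN_LENGTH = 24
```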
Effective visualization can make complex suites easier to understand. Collapsible task families for visualization are defined by the first parents in the runtime namespace hierarchy. Tasks should generally be grouped into visualization families that reflect their purpose within the structure of the suite rather than technical detail such as common job submission method or task host. This often coincides nicely with common configuration inheritance requirements, but if it doesn’t you can use an empty namespace as a first parent for visualization:
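A sketch of the empty-namespace approach, using hypothetical names (OBS, HOSTX, obs_sonde, obs_ship and the host name). OBS carries no settings and exists only to act as first parent, while HOSTX supplies the shared configuration:

```cylc
[runtime]
    [[HOSTX]]
        # shared technical settings (hypothetical host name)
        [[[remote]]]
            host = hostx.example.com
    [[OBS]]
        # deliberately empty: used only to group tasks for visualization
    [[obs_sonde, obs_ship]]
        # first parent OBS defines the collapsible family;
        # HOSTX contributes the actual settings
        inherit = OBS, HOSTX
```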
and you can demote parents from primary to secondary:
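A sketch of demotion, under the assumption that the keyword None is accepted as a placeholder first parent; the names BAR and foo are hypothetical:

```cylc
[runtime]
    [[BAR]]
        # settings shared by several tasks
    [[foo]]
        # inherit settings from BAR, but stay directly under root
        # for visualization (None as first parent is an assumption)
        inherit = None, BAR
```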
Good style is arguably just a matter of taste. That said, for collaborative development of complex systems it is important to settle on a clear and consistent style. You may find the following suggestions useful.
The suite.rc file format consists of item = value pairs under nested section headings. Clear indentation is the best way to show local nesting level inside large blocks.
Don’t align item = value pairs on the = character:
This does not show nesting level clearly, and long items push everything off to the right. The following may be acceptable though as it preserves proper indentation:
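A sketch with hypothetical items, in which values are aligned only within one block so that the section indentation is still visible:

```cylc
[runtime]
    [[foo]]
        [[[environment]]]
            # aligned within this block only; nesting level stays clear
            SHORT         = 1
            A_LONGER_NAME = 2
```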
is preferred over this (or similar):
The extra whitespace here translates directly to spurious indentation in the task job script. As it happens this is just an aesthetic problem in bash scripts, but for Python job scripts (which cylc may support in the future) it would be a technical error.
The graph, of course, can also be split up without line breaks:
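For instance, one long chain can be written as several shorter graph lines inside a triple-quoted string (task names and cycle hours are hypothetical):

```cylc
[scheduling]
    [[dependencies]]
        [[[0,12]]]
            # equivalent to: graph = "foo => bar => baz => qux"
            graph = """
                foo => bar
                bar => baz
                baz => qux
            """
```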
This appendix defines all legal suite definition config items. Embedded Jinja2 code (see Section 9.6) must process to a valid raw suite.rc file. See also Section 9.2 for a descriptive overview of suite.rc files, including syntax (Section 9.2.1).
The only top level configuration items at present are the suite title and description.
A single line description of the suite. It is displayed in the db viewer window and can be retrieved at run time with the cylc show command.
A multi-line description of the suite. It can be retrieved by the db viewer right-click menu, or at run time with the cylc show command.
This section is for configuration that is not specifically task-related.
If this item is set cylc will abort if the suite is not started in the specified mode. This can be used for demo suites that have to be run in simulation mode, for example, because they have been taken out of their normal operational context; or to prevent accidental submission of expensive real tasks during suite development.
Cylc runs off the suite host’s system clock by default. This item allows you to run the suite in UTC even if the system clock is set to local time. Clock-triggered tasks will trigger when the current UTC time is equal to their cycle time plus offset; other time values used, reported, or logged by cylc will also be in UTC.
Cylc does not normally abort if tasks fail, but if this item is turned on it will abort with exit status 1 if any task fails.
If this is turned on cylc will write the resolved dependencies of each task to the suite log as it becomes ready to run (a list of the IDs of the tasks that actually satisfied its prerequisites at run time). Mainly used for cylc testing and development.
Tasks ready to submit are now queued for processing in a background worker thread, so submitting a lot of tasks at once does not hold cylc back. In the job submission thread tasks are batched, with members of each batch being submitted in parallel. Batches are processed serially, with a delay between batches, to avoid swamping the host system with too many simultaneous job submissions.
The time required for a single task’s job submission to complete typically depends on whether it is a remote task (for which an ssh connection must be established and used) and whether dynamic host selection is used (see A.4.1.19.1; a dynamic host selection command runs as part of the job submission command). The time taken for a batch of parallel job submissions to complete will be roughly the duration of the slowest member process.
[cylc] → [[job submission]] → batch size The maximum number of tasks to be submitted in a single batch, in the job submission thread. Cylc waits for all batch member job-submissions to complete before proceeding to the next batch.
[cylc] → [[job submission]] → delay between batches It may cause a problem for some batch queue schedulers to submit too many jobs at once, so cylc allows a configurable delay between job submission batches.
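A sketch of these two items with illustrative values (not defaults); the delay is assumed here to be in seconds, as stated for the other batch delay items:

```cylc
[cylc]
    [[job submission]]
        # submit at most 10 jobs in parallel per batch
        batch size = 10
        # pause between batches (unit assumed to be seconds)
        delay between batches = 15
```

With these values, 25 ready tasks would be submitted as batches of 10, 10, and 5, each batch starting only after the previous one has completed and the delay has elapsed.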
Task poll and kill commands are queued to a worker thread that processes them in parallel, in batches to limit the number that can execute at once.
[cylc] → [[poll and kill command submission]] → batch size The maximum number of poll and kill commands to execute at once, before moving on to the next batch.
[cylc] → [[poll and kill command submission]] → delay between batches How long to wait, in seconds, before processing the next batch of poll and kill commands.
Task event handlers are queued to a worker thread that processes them in parallel, in batches to limit the number that can execute at once (suite event handlers, on the other hand, are executed as background sub-processes in the main thread, not queued to the task event handler thread).
[cylc] → [[event handler submission]] → batch size The maximum number of event handlers to execute at once, before moving on to the next batch.
[cylc] → [[event handler submission]] → delay between batches How long to wait, in seconds, before processing the next batch of event handlers.
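The two worker-thread sections described above can be sketched together; the values are illustrative, not defaults:

```cylc
[cylc]
    [[poll and kill command submission]]
        # at most 10 poll or kill commands execute at once
        batch size = 10
        # seconds to wait before the next batch
        delay between batches = 5
    [[event handler submission]]
        # at most 5 task event handlers execute at once
        batch size = 5
        delay between batches = 0
```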
Cylc has internal “hooks” to which you can attach handlers that are called by cylc whenever certain events occur. This section configures suite event hooks; see Section A.4.1.20 for task event hooks.
Event handlers can send an email or an SMS, call a pager, intervene in the operation of their own suite, or whatever. They can be held in the suite bin directory, otherwise it is up to you to ensure their location is in $PATH (in the shell in which cylc runs, on the suite host). cylc [hook] email-suite is a simple suite event handler.
Suite event handlers are called by cylc with the following arguments:
where,
Additional information can be passed to event handlers via [cylc] → [[environment]].
[cylc] → [[event hooks]] → EVENT handler Specify a handler script to call when one of the following EVENTs occurs:
Item details:
[cylc] → [[event hooks]] → timeout If a timeout is set and the timeout event is handled, the timeout event handler will be called if the suite times out before it finishes. The timer is set initially at suite start up.
[cylc] → [[event hooks]] → reset timer If True (the default) the suite timer will continually reset after any task changes state, so you can time out after some interval since the last activity occurred rather than on absolute suite execution time.
[cylc] → [[event hooks]] → abort on timeout If a suite timer is set (above) this will cause the suite to abort with error status if the suite times out while still running.
[cylc] → [[event hooks]] → abort if EVENT handler fails Cylc does not normally care whether an event handler succeeds or fails, but if this is turned on the EVENT handler will be executed in the foreground (which will block the suite while it is running) and the suite will abort if the handler fails.
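A sketch of a suite timeout configuration, taking EVENT to be timeout. The handler script name is hypothetical, and the timeout unit is assumed here to be minutes:

```cylc
[cylc]
    [[event hooks]]
        # hypothetical script, found via $PATH or the suite bin directory
        timeout handler = notify-ops.sh
        # unit assumed to be minutes
        timeout = 120
        # restart the timer whenever any task changes state
        reset timer = True
        # shut the suite down with error status if it times out
        abort on timeout = True
```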
The cylc lockserver brokers suite and task locks on the network (these are somewhat analogous to traditional local lock files). It prevents multiple instances of a suite or task from being invoked at the same time (via scheduler instances or cylc submit).
See cylc lockserver --help for how to run the lockserver, and cylc lockclient --help for occasional manual lock management requirements.
[cylc] → [[lockserver]] → enable The lockserver is currently disabled by default. It is intended mainly for operational use.
[cylc] → [[lockserver]] → simultaneous instances By default the lockserver prevents multiple simultaneous instances of a suite from running even under different registered names. But allowing this may be desirable if the I/O paths of every task in the suite are dynamically configured to be suite specific (and similarly for the suite state dump and logging directories, by using suite identity variables in their directory paths). Note that the lockserver cannot protect you from running multiple distinct copies of a suite simultaneously.
Variables defined here are exported into the environment in which cylc itself runs. They are then available to local processes spawned directly by cylc. Any variables read by task event handlers must be defined here, for instance, because event handlers are executed directly by cylc, not by running tasks. And similarly the command lines issued by cylc to invoke event handlers or to submit task job scripts could, in principle, make use of environment variables defined here.
[cylc] → [[environment]] → __VARIABLE__ Replace __VARIABLE__ with any number of environment variable assignment expressions. Values may refer to other local environment variables (order of definition is preserved) and are not evaluated or manipulated by cylc, so any variable assignment expression that is legal in the shell in which cylc is running can be used (but see the warning above on variable expansions, which will not be evaluated). White space around the ‘=’ is allowed (as far as cylc’s suite.rc parser is concerned these are normal configuration items).
Accelerated clock settings, used to speed up the wait between cycles in the simulation and dummy run modes.
[cylc] → [[accelerated clock]] → disable Disabling the accelerated clock makes the suite (and its log time stamps etc.) run on real time. Note that if the suite has clock-triggered tasks that catch up to the wall clock, the interval between cycles will also be in real time - e.g. six hours for a six hourly cycle.
[cylc] → [[accelerated clock]] → rate The rate at which the accelerated clock runs in real seconds per simulated hour.
[cylc] → [[accelerated clock]] → offset The clock offset determines the initial time on the accelerated clock, at suite startup, relative to the initial cycle time. An offset of 0 simulates real time operation; greater offsets simulate catch up from a delay and subsequent transition to real time operation.
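A sketch of accelerated clock settings with illustrative values; the offset unit is assumed here to be hours:

```cylc
[cylc]
    [[accelerated clock]]
        disable = False
        # 10 real seconds per simulated hour
        rate = 10
        # start the clock 24 (assumed) hours past the initial cycle time,
        # simulating catch up from a delay
        offset = 24
```

At this rate the gap between two six-hourly cycles would pass in about a minute of real time.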
Reference tests are finite-duration suite runs that abort with non-zero exit status if cylc fails, if any task fails, if the suite times out, or if a shutdown event handler that (by default) compares the test run with a reference run reports failure. See Automated Reference Test Suites, Section 12.19.
[cylc] → [[reference test]] → suite shutdown event handler A shutdown event handler that should compare the test run with the reference run, exiting with zero exit status only if the test run verifies.
As for any event handler, the full path can be omitted if the script is located somewhere in $PATH or in the suite bin directory.
[cylc] → [[reference test]] → required run mode If your reference test is only valid for a particular run mode, this setting will cause cylc to abort if a reference test is attempted in another run mode.
[cylc] → [[reference test]] → allow task failures A reference test run will abort immediately if any task fails, unless this item is set, or a list of expected task failures is provided (below).
[cylc] → [[reference test]] → expected task failures A reference test run will abort immediately if any task fails, unless allow task failures is set (above) or the failed task is found in a list of IDs of tasks that are expected to fail.
[cylc] → [[reference test]] → live mode suite timeout The timeout value in minutes after which the test run should be aborted if it has not finished, in live mode. Test runs cannot be done in live mode unless you define a value for this item, because it is not possible to arrive at a sensible default for all suites.
[cylc] → [[reference test]] → simulation mode suite timeout The timeout value in minutes after which the test run should be aborted if it has not finished, in simulation mode. Test runs cannot be done in simulation mode unless you define a value for this item, because it is not possible to arrive at a sensible default for all suites.
[cylc] → [[reference test]] → dummy mode suite timeout The timeout value in minutes after which the test run should be aborted if it has not finished, in dummy mode. Test runs cannot be done in dummy mode unless you define a value for this item, because it is not possible to arrive at a sensible default for all suites.
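A sketch of a reference test configuration restricted to live mode; the values are illustrative only:

```cylc
[cylc]
    [[reference test]]
        # abort if the test is attempted in another run mode
        required run mode = live
        # any task failure aborts the test run
        allow task failures = False
        # minutes; required before a live mode test run can be done
        live mode suite timeout = 5
```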
This section allows cylc to determine when tasks are ready to run.
At startup each cycling task (unless specifically excluded under [special tasks]) will be inserted into the suite with this cycle time, or with the closest subsequent valid cycle time for the task. Note that whether or not cold-start tasks, specified under [special tasks], are inserted, and in what state they are inserted, depends on the start up method - cold, warm, or raw. If this item is provided you can override it on the command line or in the gcylc suite start panel.
Cycling tasks are held once they pass the final cycle time, if one is specified. Once all tasks have achieved this state the suite will shut down. If this item is provided you can override it on the command line or in the gcylc suite start panel.
The suite runahead limit prevents the fastest tasks in a suite from getting too far ahead of the slowest ones, as documented in Section 12.11.1. Tasks exceeding the limit are put into a special runahead held state until slower tasks have caught up sufficiently.
Configuration of internal queues, by which the number of simultaneously active tasks (submitted or running) can be limited, per queue. By default a single queue called default is defined, with all tasks assigned to it and no limit. To use a single queue for the whole suite just set the limit on the default queue as required. See also Section 12.11.2.
[scheduling] → [[queues]] → [[[__QUEUE__]]] Section heading for configuration of a single queue. Replace __QUEUE__ with a queue name, and repeat the section as required.
[scheduling] → [[queues]] → [[[__QUEUE__]]] → limit The maximum number of active tasks allowed at any one time, for this queue.
[scheduling] → [[queues]] → [[[__QUEUE__]]] → members A list of member tasks, or task family names, to assign to this queue (assigned tasks will automatically be removed from the default queue).
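A sketch of queue configuration, with a hypothetical queue name (big_jobs) and hypothetical members (a task model_a and a family POSTPROC):

```cylc
[scheduling]
    [[queues]]
        [[[default]]]
            # at most 10 active tasks among those left in the default queue
            limit = 10
        [[[big_jobs]]]
            # at most 2 of these members submitted or running at once
            limit = 2
            members = model_a, POSTPROC
```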
This section is used to identify any tasks with several kinds of special behaviour. By default (i.e. non “special” behaviour) tasks submit (or queue) as soon as their prerequisites are satisfied, and they spawn a successor as soon as they enter the submitted state.9 Family names used here are interpreted purely as shorthand for the list of all member tasks. A sequential family, therefore, is a family of sequential tasks, not a family that behaves “sequentially” as a whole.
[scheduling] → [[special tasks]] → clock-triggered Clock-triggered tasks wait on a wall clock time specified as an offset in hours relative to their own cycle time, in addition to any dependence they have on other tasks. Generally speaking, only tasks that wait on external real time data need to be clock-triggered. Note that in computing the trigger time the full wall clock time and cycle time are compared, not just hours and minutes of the day, so when running a suite in catchup/delayed operation, or over historical periods, clock-triggered tasks will not constrain the suite at all until they catch up to the wall clock.
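A sketch of a clock-triggered data retrieval task. The task names are hypothetical, and the NAME(offset) form used to attach the offset is an assumption:

```cylc
[scheduling]
    [[special tasks]]
        # get_obs waits until 1.5 hours past its own cycle time
        clock-triggered = get_obs(1.5)
    [[dependencies]]
        [[[0,6,12,18]]]
            # downstream tasks are constrained only through get_obs
            graph = "get_obs => process_obs"
```

Under these settings the 06 cycle instance of get_obs would trigger at 07:30 wall clock time on the same date, or immediately if that time has already passed.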
[scheduling] → [[special tasks]] → start-up Start-up tasks are one-off tasks (they do not spawn a successor) that only run in the first cycle (and only in a cold-start) and any dependence on them is ignored in subsequent cycles. They can be used to prepare a suite workspace, for example, before other tasks run. Start-up tasks cannot appear in conditional trigger expressions with normal cycling tasks, because the meaning of the conditional expression becomes undefined in subsequent cycles.
[scheduling] → [[special tasks]] → cold-start A cold-start task is a one-off task used to satisfy the dependence of an associated task with the same cycle time, on outputs from a previous cycle - when those outputs are not available. The primary use for this is to cold-start a warm-cycled forecast model that normally depends on restart files (e.g. model background fields) generated by its previous forecast, when there is no previous forecast. This is required when cold-starting the suite, but cold-start tasks can also be inserted into a running suite to restart a model that has had to skip some cycles after running into problems. Cold-start tasks can invoke real cold-start processes, or they can just be dummy tasks that represent some external process that has to be completed before the suite is started. Unlike start-up tasks, dependence on cold-start tasks is preserved in subsequent cycles so they must typically be used in OR’d conditional expressions to avoid holding up the suite.
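A sketch of a cold-start task in an OR'd conditional expression. The task names are hypothetical, and the [T-6] notation for the previous six-hourly cycle is an assumption about graph syntax:

```cylc
[scheduling]
    [[special tasks]]
        cold-start = cold_model
    [[dependencies]]
        [[[0,6,12,18]]]
            # model runs off either the cold-start task or its own previous
            # instance, so later cycles are not held up waiting on cold_model
            graph = "cold_model | model[T-6] => model"
```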
[scheduling] → [[special tasks]] → sequential By default, a task spawns a successor as soon as it is submitted to run so that successive instances of the same task can run in parallel if the opportunity arises (i.e. if their prerequisites happen to be satisfied before their predecessor has finished). Sequential tasks, however, will not spawn a successor until they have finished successfully. This should be used for (a) tasks that cannot run in parallel with their own previous instances because they would somehow interfere with each other (use cycle time in all I/O paths to avoid this); and (b) warm cycled forecast models that write out restart files for multiple cycles ahead (exception: see “explicit restart outputs” below).10
[scheduling] → [[special tasks]] → one-off Synchronous one-off tasks have an associated cycle time but do not spawn a successor. Synchronous start-up and cold-start tasks are automatically one-off tasks and do not need to be listed here. Dependence on one-off tasks is not restricted to the first cycle.
[scheduling] → [[special tasks]] → explicit restart outputs This is only required in the event that you need a warm cycled forecast model to start at the instant its restart files are ready (if other prerequisites are satisfied) even if its previous instance has not finished yet. If so, the model task has to depend on special output messages emitted by the previous instance as soon as its restart files are ready, instead of just on the previous instance finishing. Tasks in this category must define special restart output messages containing the word “restart”, in [runtime] → [[TASK]] → [[[outputs]]] - see Section 10.3.
[scheduling] → [[special tasks]] → exclude at start-up Any task listed here will be excluded from the initial task pool (this goes for suite restarts too). If an inclusion list is also specified, the initial pool will contain only included tasks that have not been excluded. Excluded tasks can still be inserted at run time. Other tasks may still depend on excluded tasks if they have not been removed from the suite dependency graph, in which case some manual triggering, or insertion of excluded tasks, may be required.
[scheduling] → [[special tasks]] → include at start-up If this list is not empty, any task not listed in it will be excluded from the initial task pool (this goes for suite restarts too). If an exclusion list is also specified, the initial pool will contain only included tasks that have not been excluded. Excluded tasks can still be inserted at run time. Other tasks may still depend on excluded tasks if they have not been removed from the suite dependency graph, in which case some manual triggering, or insertion of excluded tasks, may be required.
The suite dependency graph is defined under this section. You can plot the dependency graph as you work on it, with cylc graph or by right clicking on the suite in the db viewer. See also Section 9.3.
[scheduling] → [[dependencies]] → graph The dependency graph for any one-off asynchronous (non-cycling) tasks in the suite goes here. This can be used to construct a suite of one-off tasks (e.g. build jobs and related processing) that just completes and then exits, or an initial suite section that completes prior to the cycling tasks starting (if you make the first cycling tasks depend on the last one-off ones). But note that synchronous start-up tasks can also be used for the latter purpose. See Section A.3.6.2.1 below for graph string syntax, and Section 9.3.
[scheduling] → [[dependencies]] → [[[__VALIDITY__]]] __VALIDITY__ section headings define the sequence of cycle times for which the subsequent graph section is valid. For cycling tasks use a comma-separated list of integer hours, 0 ≤ H ≤ 23 for the original hours-of-the-day cycling, or reference a particular stepped daily, monthly, or yearly cycling module:
For repeating asynchronous tasks put ‘ASYNCID:pattern’ in the section heading, where pattern is a regular expression that matches an asynchronous task ID:
See Section 9.3.3, Graph Types for the meaning of the stepped cycler arguments, how multiple graph sections combine within a single suite, and so on.
[scheduling] → [[dependencies]] → [[[__VALIDITY__]]] → graph The dependency graph for the specified validity section (described just above) goes here. Syntax examples follow; see also Sections 9.3 (Configuring Scheduling) and 9.3.4 (Trigger Types).
[scheduling] → [[dependencies]] → [[[__VALIDITY__]]] → daemon For [[[ASYNCID:pattern]]] validity sections only, list asynchronous daemon tasks by name. This item is located here rather than under [scheduling] → [[special tasks]] because a daemon task is associated with a particular asynchronous ID.
This section is used to specify how, where, and what to execute when tasks are ready to run. Common configuration can be factored out in a multiple-inheritance hierarchy of runtime namespaces that culminates in the tasks of the suite. Order of precedence is determined by the C3 linearization algorithm, as used to find the method resolution order in Python class hierarchies. For details and examples see Section 9.4, Runtime Properties.
Replace __NAME__ with a namespace name, or a comma separated list of names, and repeat as needed to define all tasks in the suite. Names may contain letters, digits, underscores, and hyphens. A namespace represents a group or family of tasks if other namespaces inherit from it, or a task if no others inherit from it.
If multiple names are listed the subsequent settings apply to each.
All namespaces inherit initially from root, which can be explicitly configured to provide or override default settings for all tasks in the suite.
[runtime] → [[__NAME__]] → inherit A list of the immediate parent(s) this namespace inherits from. If no parents are listed root is assumed.
[runtime] → [[__NAME__]] → title A single line description of this namespace. It is displayed by the cylc list command and can be retrieved from running tasks with the cylc show command.
[runtime] → [[__NAME__]] → description A multi-line description of this namespace, retrievable from running tasks with the cylc show command.
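A sketch of a small namespace hierarchy; all names and values are hypothetical:

```cylc
[runtime]
    [[root]]
        # defaults for every task in the suite
        [[[environment]]]
            SUITE_DATA = /path/to/data
    [[MODEL]]
        # a family, because other namespaces inherit from it
        pre-command scripting = "echo preparing model run"
    [[BATCH]]
        [[[job submission]]]
            shell = /bin/bash
    [[model_a, model_b]]
        # tasks, because nothing inherits from them; the settings below
        # apply to both names
        inherit = MODEL, BATCH
        title = "hypothetical forecast model"
        description = """A longer description,
retrievable from running tasks."""
```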
[runtime] → [[__NAME__]] → initial scripting Initial scripting is executed at the top of the task job script just before the cylc task started message call is made, and before the task execution environment is configured - so it does not have access to any suite or task environment variables. The original intention was to allow remote tasks to source login scripts before calling the first cylc command, e.g. to set $PYTHONPATH if Pyro has been installed locally. Note however that the remote task invocation mechanism now automatically sources both /etc/profile and $HOME/.profile if they exist. For other uses pre-command scripting should be used if possible because it has access to the task execution environment.
[runtime] → [[__NAME__]] → environment scripting Environment scripting is inserted into the task job script between the cylc-defined environment (suite and task identity, etc.) and the user-defined task runtime environment - i.e. it has access to the cylc environment, and the task environment has access to the results of this scripting.
[runtime] → [[__NAME__]] → command scripting The scripting to execute when the associated task is ready to run - this can be a single command or multiple lines of scripting.
[runtime] → [[__NAME__]] → pre-command scripting Scripting to be executed immediately before the command scripting. This would typically be used to add scripting to every task in a family (for individual tasks you could just incorporate the extra commands into the main command scripting). See also post-command scripting, below.
[runtime] → [[__NAME__]] → post-command scripting Scripting to be executed immediately after the command scripting. This would typically be used to add scripting to every task in a family (for individual tasks you could just incorporate the extra commands into the main command scripting). See also pre-command scripting, above.
[runtime] → [[__NAME__]] → retry delays A list of time intervals in minutes, after which to resubmit the task if it fails. The variable $CYLC_TASK_TRY_NUMBER in the task execution environment is incremented each time, starting from 1 for the first try - this can be used to vary task behavior by try number.
[runtime] → [[__NAME__]] → submission polling intervals A list of intervals, in minutes, with optional multipliers, after which cylc will poll for status while the task is in the submitted state.
For the polling task communications method this overrides the default submission polling interval in the site/user config files (Section 6). For pyro and ssh task communications polling is not done by default but it can still be configured here as a regular check on the health of submitted tasks.
Each list value is used in turn until the last, which is used repeatedly until finished.
Detaching tasks cannot be polled or killed by cylc - see Section 10.5.
A single interval value is probably appropriate for submission polling.
[runtime] → [[__NAME__]] → execution polling intervals A list of intervals, in minutes, with optional multipliers, after which cylc will poll for status while the task is in the running state.
For the polling task communications method this overrides the default execution polling interval in the site/user config files (Section 6). For pyro and ssh task communications polling is not done by default but it can still be configured here as a regular check on the health of running tasks.
Each list value is used in turn until the last, which is used repeatedly until finished.
Detaching tasks cannot be polled or killed by cylc - see Section 10.5.
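A sketch of retry and polling settings for a hypothetical task foo; the values are illustrative, in minutes, and the optional multipliers are not used:

```cylc
[runtime]
    [[foo]]
        # on failure, resubmit after 1 minute, then 5, then 10
        retry delays = 1, 5, 10
        # poll every 2 minutes while the task is in the submitted state
        submission polling intervals = 2
        # while running: poll after 1 minute, then 5, then every 30
        execution polling intervals = 1, 5, 30
```

Because the last list value is reused until the task finishes, the final entry sets the steady-state polling interval.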
[runtime] → [[__NAME__]] → manual completion If a task’s initiating process detaches and exits before task processing is finished then cylc cannot arrange for the task to automatically signal when it has succeeded or failed. In such cases you must use this configuration item to tell cylc not to arrange for automatic completion messaging, and insert some minimal completion messaging yourself in appropriate places in the task implementation (see Section 10.5).
[runtime] → [[__NAME__]] → work sub-directory Task command scripting is executed from within automatically created work directories, which can be accessed by their tasks through $CYLC_TASK_WORK_DIR. This item sets the low-level sub-directory name. The default value provides a unique workspace for each task, but this can be overridden to make groups of tasks run in the same working directory, thereby providing a share space for tasks that read and write from their current working directories.
[runtime] → [[__NAME__]] → enable resurrection If a message is received from a failed task cylc will normally treat this as an error condition, issue a warning, and leave the task in the “failed” state. But if “enable resurrection” is switched on failed tasks can come back from the dead: if the same task job script is executed again cylc will put the task back into the running state and continue as normal when the started message is received. This can be used to handle HPC-style job preemption wherein a resource manager may kill a running task and reschedule it to run again later, to make way for a job with higher immediate priority. See also Section 12.15, Handling Job Preemption.
[runtime] → [[__NAME__]] → [[[dummy mode]]] Dummy mode configuration.
[runtime] → [[__NAME__]] → [[[dummy mode]]] → command scripting The scripting to execute when the associated task is ready to run, in dummy mode - this can be a single command or multiple lines of scripting.
[runtime] → [[__NAME__]] → [[[dummy mode]]] → disable pre-command scripting This disables pre-command scripting, which is likely to contain code specific to the real task, in dummy mode.
[runtime] → [[__NAME__]] → [[[dummy mode]]] → disable post-command scripting This disables post-command scripting, which is likely to contain code specific to the real task, in dummy mode.
[runtime] → [[__NAME__]] → [[[simulation mode]]] Simulation mode configuration.
[runtime] → [[__NAME__]] → [[[simulation mode]]] → run time range This defines an interval [min,max) (seconds) from within which the simulation mode task run length will be randomly chosen.
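A sketch of dummy and simulation mode settings for a hypothetical task foo; the values are illustrative:

```cylc
[runtime]
    [[foo]]
        [[[dummy mode]]]
            # stand-in for the real task in dummy mode
            command scripting = "echo dummy task; sleep 10"
            # skip scripting written for the real task
            disable pre-command scripting = True
            disable post-command scripting = True
        [[[simulation mode]]]
            # run length chosen at random from [1,16) seconds
            run time range = 1,16
```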
[runtime] → [[__NAME__]] → [[[job submission]]] This section configures the means by which cylc submits task job scripts to run.
[runtime] → [[__NAME__]] → [[[job submission]]] → method See Task Job Submission (Section 11) for how job submission works, and how to define new methods. Cylc has a number of built in job submission methods:
[runtime] → [[__NAME__]] → [[[job submission]]] → command template This allows you to override the actual command used by the chosen job submission method. The template’s first %s will be substituted by the job file path. Where applicable the second and third %s will be substituted by the paths to the job stdout and stderr files.
[runtime] → [[__NAME__]] → [[[job submission]]] → shell This is the shell used to interpret the job script submitted by cylc when a task is ready to run. It has no bearing on the shell used in task implementations. Command scripting and suite environment variable assignment expressions must be valid for this shell. The latter is currently hardwired into cylc as export item=value - valid for both bash and ksh because value is entirely user-defined - but cylc would have to be modified slightly to allow use of the C shell.
[runtime] → [[__NAME__]] → [[[job submission]]] → retry delays A list of time intervals in minutes, after which to resubmit if job submission fails.
[runtime] → [[__NAME__]] → [[[remote]]] Configure host and username, for tasks that do not run on the suite host account. Passwordless ssh is used to submit the task by the configured job submission method, so you must distribute your ssh key to allow this. Cylc must be installed on remote task hosts, but of the external software dependencies only Pyro is required there (not even that if ssh messaging is used; see below).
[runtime] → [[__NAME__]] → [[[remote]]] → host The remote host for this namespace. This can be a static hostname, an environment variable that holds a hostname, or a command that prints a hostname to stdout. Host selection commands are executed just prior to job submission. The host (static or dynamic) may have an entry in the cylc site or user config file to specify parameters such as the location of cylc on the remote machine; if not, the corresponding local settings (on the suite host) will be assumed to apply on the remote host.
[runtime] → [[__NAME__]] → [[[remote]]] → owner The username of the task host account. This is (only) used in the passwordless ssh command invoked by cylc to submit the remote task; consequently it may be defined using local environment variables (i.e. those of the shell in which cylc runs, and [cylc] → [[environment]]).
If you use dynamic host selection and have different usernames on the different selectable hosts, you can configure your $HOME/.ssh/config to handle username translation.
[runtime] → [[__NAME__]] → [[[remote]]] → suite definition directory The path to the suite definition directory on the remote host, needed if remote tasks require access to files stored there (via $CYLC_SUITE_DEF_PATH) or in the suite bin directory (via $PATH). If this item is not defined, the local suite definition directory path will be assumed, with the suite owner’s home directory, if present, replaced by '$HOME' for interpretation on the remote host.
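The default path translation described above can be sketched in Python. This is a minimal illustration, assuming the home directory is only replaced when it is a leading component of the path; the function name is hypothetical.

```python
import os


def remote_default_path(local_path, home=None):
    """Replace a leading local home directory with a literal '$HOME'.

    The result is meant to be expanded by the shell on the remote host.
    Paths outside the home directory are returned unchanged.
    """
    if home is None:
        home = os.path.expanduser("~")
    home = home.rstrip("/")
    if local_path == home or local_path.startswith(home + "/"):
        return "$HOME" + local_path[len(home):]
    return local_path
```

With home set to /home/alice, the path /home/alice/suites/demo becomes $HOME/suites/demo, while /opt/suites/demo is left as it is.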
[runtime] → [[__NAME__]] → [[[event hooks]]] Cylc has internal “hooks” to which you can attach handlers that are called by cylc whenever certain events occur. This section configures task event hooks; see Section A.2.8 for suite event hooks.
Event handlers can send an email or an SMS, call a pager, intervene in the operation of their own suite, or whatever. They can be held in the suite bin directory, otherwise it is up to you to ensure their location is in $PATH (in the shell in which cylc runs, on the suite host). cylc [hook] email-task is a simple task event handler.
Task event handlers are called by cylc with the following arguments:
where,
Additional information can be passed to event handlers via the [cylc] → [[environment]] (but not via task runtime environments - event handlers are not called by tasks).
[runtime] → [[__NAME__]] → [[[event hooks]]] → EVENT handler Specify a handler script to call when one of the following EVENTs occurs:
Item details:
[runtime] → [[__NAME__]] → [[[event hooks]]] → submission timeout If a task has not started the specified number of minutes after it was submitted, the submission timeout event handler will be called.
[runtime] → [[__NAME__]] → [[[event hooks]]] → execution timeout If a task has not finished the specified number of minutes after it started running, the execution timeout event handler will be called.
[runtime] → [[__NAME__]] → [[[event hooks]]] → reset timer If you set an execution timeout the timer can be reset to zero every time a message is received from the running task (which indicates the task is still alive). Otherwise, the task will time out if it does not finish in the allotted time, regardless of incoming messages.
[runtime] → [[__NAME__]] → [[[environment]]] The user defined task execution environment. Variables defined here can refer to cylc suite and task identity variables, which are exported earlier in the task job script, and variable assignment expressions can use cylc utility commands because access to cylc is also configured earlier in the script. See also Task Execution Environment, Section 9.4.7.
[runtime] → [[__NAME__]] → [[[environment]]] → __VARIABLE__ Replace __VARIABLE__ with any number of environment variable assignment expressions. Order of definition is preserved so values can refer to previously defined variables. Values are passed through to the task job script without evaluation or manipulation by cylc, so any variable assignment expression that is legal in the job submission shell can be used. White space around the ‘=’ is allowed (as far as cylc’s suite.rc parser is concerned these are just normal configuration items).
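As a rough illustration of order preservation and pass-through, the following Python sketch turns an ordered set of variable definitions into export item=value lines of the kind written to the task job script. The function name and the example variables are hypothetical; the real job script generator may differ in detail.

```python
def export_lines(environment):
    """Return one 'export NAME=VALUE' line per variable, in definition order.

    Values are copied through untouched, so shell expressions such as
    $OTHER_VAR are left for the job submission shell to evaluate.
    """
    return ["export %s=%s" % (name, value) for name, value in environment.items()]


# Hypothetical variables; the second refers to the first, which works
# because definition order is preserved (dicts keep insertion order).
env = {"DATA_DIR": "$HOME/data", "INPUT_FILE": "$DATA_DIR/input.nc"}
for line in export_lines(env):
    print(line)
```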
[runtime] → [[__NAME__]] → [[[environment filter]]] This section contains environment variable inclusion and exclusion lists that can be used to filter the inherited environment. This is not intended as an alternative to a well-designed inheritance hierarchy that provides each task with just the variables it needs. Filters can, however, improve suites with tasks that inherit a lot of environment they don’t need, by making it clear which tasks use which variables. They can optionally be used routinely as explicit “task environment interfaces” too, at some cost to brevity, because they guarantee that variables filtered out of the inherited task environment are not used.
Note that environment filtering is done after inheritance is completely worked out, not at each level on the way, so filter lists in higher-level namespaces only have an effect if they are not overridden by descendants.
[runtime] → [[__NAME__]] → [[[environment filter]]] → include If given, only variables named in this list will be included from the inherited environment, others will be filtered out. Variables may also be explicitly excluded by an exclude list.
[runtime] → [[__NAME__]] → [[[environment filter]]] → exclude Variables named in this list will be filtered out of the inherited environment. Variables may also be implicitly excluded by omission from an include list.
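The combined effect of the include and exclude lists can be sketched in Python as below. This is an illustration under stated assumptions, not cylc's implementation: the function name is hypothetical, and a variable named in both lists is assumed to be excluded.

```python
def filter_environment(inherited, include=None, exclude=None):
    """Apply include and exclude lists to a fully inherited environment.

    inherited: dict of variable name to value, after inheritance is resolved.
    include:   if given, only these names are kept.
    exclude:   if given, these names are dropped (assumed to win over include).
    """
    filtered = {}
    for name, value in inherited.items():
        if include is not None and name not in include:
            continue  # implicitly excluded by omission from the include list
        if exclude is not None and name in exclude:
            continue  # explicitly excluded
        filtered[name] = value
    return filtered
```

Because the filter runs on the final inherited environment, it only needs to be applied once per task rather than at each level of the namespace hierarchy.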
[runtime] → [[__NAME__]] → [[[directives]]] Batch queue scheduler directives. Whether or not these are used depends on the job submission method. For the built-in loadleveler, pbs, and sge methods directives are written to the top of the task job script in the correct format for the method. Specifying directives individually like this allows use of default directives that can be individually overridden at lower levels of the runtime namespace hierarchy.
[runtime] → [[__NAME__]] → [[[directives]]] → __DIRECTIVE__ Replace __DIRECTIVE__ with each directive assignment, e.g. class = parallel
Example directives for the built-in job submission methods are shown in Section 11.3.
[runtime] → [[__NAME__]] → [[[outputs]]] This section is only required if other tasks need to trigger off specific internal outputs of this task (as opposed to triggering off it finishing). The task implementation must report the specified output messages by calling cylc task message when the corresponding real outputs have been completed.
[runtime] → [[__NAME__]] → [[[outputs]]] → __OUTPUT__ Replace __OUTPUT__ with any number of labelled output messages.
where the item name must match the output label associated with this task in the suite dependency graph, e.g.:
[runtime] → [[__NAME__]] → [[[suite state polling]]] Configure automatic suite polling tasks as described in Section 12.20. The items in this section reflect the options and defaults of the cylc suite-state command, except that the target suite name and the --task, --cycle, and --status options are taken from the graph notation.
[runtime] → [[__NAME__]] → [[[suite state polling]]] → run-dir For your own suites the run database location is determined by your site/user config. For other suites, e.g. those owned by others, or mirrored suite databases, use this item to specify the location of the top level cylc run directory (the database should be a suite-name sub-directory of this location).
[runtime] → [[__NAME__]] → [[[suite state polling]]] → interval Polling interval.
[runtime] → [[__NAME__]] → [[[suite state polling]]] → max-polls The maximum number of polls before timing out and entering the ‘failed’ state.
[runtime] → [[__NAME__]] → [[[suite state polling]]] → user Username of an account on the suite host to which you have access. The polling cylc suite-state command will be invoked on the remote account.
[runtime] → [[__NAME__]] → [[[suite state polling]]] → host The hostname of the target suite. The polling cylc suite-state command will be invoked on the remote account.
[runtime] → [[__NAME__]] → [[[suite state polling]]] → verbose Run the polling cylc suite-state command in verbose output mode.
Configuration of suite graphing and, where applicable, the gcylc graph view. Graphviz documentation of node shapes and so on can be found at http://www.graphviz.org/Documentation.php.
The cycle time from which to start the suite graph.
The cycle time at which to end the suite graph.
A list of family (namespace) names to be shown in the collapsed state (i.e. the family members will be replaced by a single family node) when the suite is first plotted in the graph viewer or the gcylc graph view. If this item is not set, the default is to collapse all families at first. Interactive GUI controls can then be used to group and ungroup family nodes at will.
Graph edges (dependency arrows) can be plotted in the same color as the upstream node (task or family) to make paths through a complex graph easier to follow.
Graph node labels can be printed in the same color as the node outline.
Set the default attributes (color and style etc.) of graph nodes (tasks and families). Attribute pairs must be quoted to hide the internal = character.
Set the default attributes (color and style etc.) of graph edges (dependency arrows). Attribute pairs must be quoted to hide the internal = character.
If True, the gcylc graph view will write out a dot-language graph file on every change; these can be post-processed into a movie showing how the suite evolves. The frames will be written to the run time graph directory (see below).
Define named groups of graph nodes (tasks and families) which can be styled en masse, by name, in [visualization] → [[node attributes]]. Node groups are automatically defined for all task families, including root, so you can style family and member nodes at once by family name.
[visualization] → [[node groups]] → __GROUP__ Replace __GROUP__ with each named group of tasks or families.
Here you can assign graph node attributes to specific nodes, or to all members of named groups defined in [visualization] → [[node groups]]. Task families are automatically node groups. Styling of a family node applies to all member nodes (tasks and sub-families), but precedence is determined by ordering in the suite definition. For example, if you style a family red and then one of its members green, cylc will plot a red family with one green member; but if you style one member green and then the family red, the red family styling will override the earlier green styling of the member.
[visualization] → [[node attributes]] → __NAME__ Replace __NAME__ with each node or node group for style attribute assignment.
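The ordering rule for node attributes can be illustrated with a small Python sketch in which later assignments simply overwrite earlier ones. The function name, the data layout, and the family and task names are hypothetical and chosen only to mirror the red and green example above.

```python
def resolve_node_styles(assignments, groups):
    """Apply style assignments in definition order; later ones win.

    assignments: list of (name, attributes dict) in suite definition order.
    groups:      dict mapping a group or family name to its member nodes.
    """
    styles = {}
    for name, attributes in assignments:
        # A family or group assignment styles the named node and all members.
        targets = [name] + list(groups.get(name, []))
        for node in targets:
            styles.setdefault(node, {}).update(attributes)
    return styles


groups = {"FAM": ["m1", "m2"]}
family_first = resolve_node_styles(
    [("FAM", {"color": "red"}), ("m1", {"color": "green"})], groups)
member_first = resolve_node_styles(
    [("m1", {"color": "green"}), ("FAM", {"color": "red"})], groups)
print(family_first["m1"]["color"])  # green: the member was styled last
print(member_first["m1"]["color"])  # red: the family styling came later
```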
Cylc can generate graphs of dependencies resolved at run time, i.e. what actually triggers off what as the suite runs. This feature is retained mainly for development and debugging purposes. You can use simulation mode or dummy mode to generate runtime graphs very quickly.
[visualization] → [[runtime graph]] → enable Runtime graphing is disabled by default.
[visualization] → [[runtime graph]] → cutoff New nodes will be added to the runtime graph as the corresponding tasks trigger, until their cycle time exceeds the initial cycle time by more than this cutoff, in hours.
[visualization] → [[runtime graph]] → directory Where to put the runtime graph file, runtime-graph.dot.
See Section 9.7.
Cylc provides, via $CYLC_DIR/conf/suiterc/⋆.spec, sensible default values for many configuration items so that most users will not need to explicitly configure log directories and so on. The defaults are sufficient, in fact, to allow test suites defined by dependency graph alone (command scripting, for example, defaults to printing a simple message, sleeping for a few seconds, and then exiting).
The cylc get-config command parses a suite definition and retrieves configuration values for individual items, sections, or entire suites.
This section defines all legal items and values for cylc site and user config files. See Site And User Config Files (Section 6) for file locations, intended usage, and how to generate the files using the cylc get-global-config command.
As for suite definitions, Jinja2 expressions can be embedded in site and user config files to generate the final result parsed by cylc. Use of Jinja2 in suite definitions is documented in Section 9.6.
A temporary directory is needed by a few cylc commands, and is cleaned automatically on exit. Leave unset for the default (usually $TMPDIR).
A rolling archive of suite state dumps is maintained under the suite run directory, and is used for restarts; this item determines the number of previous states retained. The most recent saved state file is called state. Successively older files have increasing integer values appended, starting from 1.
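The rolling archive naming scheme can be sketched in Python as follows. This is an illustration, not cylc's code: the function name is hypothetical, and the "." separator between the file name and the integer is an assumption, since the text only says that integers are appended.

```python
import os


def roll_state_files(directory, keep):
    """Shift existing state files up by one before a new 'state' is written.

    state.(keep-1) becomes state.keep, and so on down to state becoming
    state.1; whatever was in state.keep is overwritten and so dropped.
    """
    for n in range(keep - 1, 0, -1):
        older = os.path.join(directory, "state.%d" % n)
        if os.path.exists(older):
            os.replace(older, os.path.join(directory, "state.%d" % (n + 1)))
    current = os.path.join(directory, "state")
    if keep >= 1 and os.path.exists(current):
        os.replace(current, os.path.join(directory, "state.1"))
```

Calling roll_state_files immediately before each new dump keeps at most keep previous states alongside the current state file.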
Commands that intervene in running suites can be made to ask for confirmation before acting. Some find this annoying and ineffective as a safety measure, however, so command prompts are disabled by default.
The suite run directory tree is created anew with every suite start (not restart) but output from the most recent previous runs can be retained in a rolling archive. Set length to 0 to keep no backups. This is incompatible with current Rose suite housekeeping (see Section 14 for more on Rose) so it is disabled by default, in which case new suite run files will overwrite existing ones in the same run directory tree. Rarely, this can result in incorrect polling results due to the presence of old task status files.
The number of old run directory trees to retain if run directory housekeeping is enabled.
Cylc can poll running jobs to catch problems that prevent task messages from being sent back to the suite, such as hard job kills, network outages, or unplanned task host shutdown. Routine polling is done only for the polling task communication method (below) unless suite-specific polling is configured in the suite definition. A list of interval values can be specified, with the last value used repeatedly until the task is finished - this allows more frequent polling near the beginning and end of the anticipated task run time. Multipliers can be used as shorthand as in the example below.
Cylc can also poll submitted jobs to catch problems that prevent the submitted job from executing at all, such as deletion from an external batch scheduler queue. Routine polling is done only for the polling task communication method (below) unless suite-specific polling is configured in the suite definition. A list of interval values can be specified as for execution polling (above) but a single value is probably sufficient for job submission polling.
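The "last value used repeatedly" behaviour of a polling interval list can be sketched with a Python generator. The function name is hypothetical, the units are whatever the configuration item uses, and the multiplier shorthand is not modelled here.

```python
import itertools


def polling_intervals(configured):
    """Yield each configured interval once, then repeat the last forever."""
    if not configured:
        raise ValueError("at least one interval is required")
    for value in configured:
        yield value
    while True:
        yield configured[-1]


# First five intervals from a three-value list (values are illustrative).
print(list(itertools.islice(polling_intervals([1.0, 5.0, 10.0]), 5)))
```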
This section contains configuration items that affect task-to-suite communications.
If a send fails, the messaging code will retry after a configured delay interval.
If successive sends fail, the messaging code will give up after a configured number of tries.
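The retry behaviour described by these two items can be sketched in Python. This is a generic retry loop written for illustration, not the cylc messaging code; the function and parameter names are hypothetical.

```python
import time


def send_with_retries(send, max_tries, retry_interval, sleep=time.sleep):
    """Call send() until it succeeds or max_tries attempts have failed.

    send:           callable that raises an exception on failure.
    retry_interval: delay in seconds between a failure and the next attempt.
    sleep:          injectable for testing; defaults to time.sleep.
    """
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(1, max_tries + 1):
        try:
            return send()
        except Exception:
            if attempt == max_tries:
                raise  # give up: the final failure propagates to the caller
            sleep(retry_interval)
```

The final failure is re-raised rather than swallowed, so the caller can decide what giving up should mean.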
This is the same as the --pyro-timeout option in cylc commands. Without a timeout Pyro connections to unresponsive suites can hang indefinitely (suites suspended with Ctrl-Z for instance).
The suite event log, held under the suite run directory, is maintained as a rolling archive. Logs are rolled over (backed up and started anew) when they reach a configurable limit size.
If true, a new suite log will be started for a new suite run.
How many rolled logs to retain in the archive.
Suite event logs are rolled over when they reach this file size.
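Size-based log rolling of this kind can be demonstrated with the Python standard library's RotatingFileHandler. This is a stand-in used for illustration: the source does not say how cylc implements its suite log, and the function and parameter names below are hypothetical.

```python
import logging
import logging.handlers
import os


def make_rolling_logger(path, max_bytes, backups, roll_at_startup=False,
                        name="suite-log-sketch"):
    """Return a logger whose file rolls over near max_bytes.

    backups:         how many rolled logs (path.1, path.2, ...) to retain.
    roll_at_startup: if True, start a fresh log when one already has content.
    """
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups)
    if roll_at_startup and os.path.getsize(path) > 0:
        handler.doRollover()  # back up the existing log and start anew
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep sketch output out of the root logger
    logger.addHandler(handler)
    return logger
```

Note that the standard library handler never rolls over if either the size limit or the backup count is zero.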
Documentation locations for the cylc doc command and gcylc Help menus.
File locations of documentation held locally on the cylc host server.
[documentation] → [files] → html index File location of the main cylc documentation index.
[documentation] → [files] → pdf user guide File location of the cylc User Guide, PDF version.
[documentation] → [files] → multi-page html user guide File location of the cylc User Guide, multi-page HTML version.
[documentation] → [files] → single-page html user guide File location of the cylc User Guide, single-page HTML version.
Online documentation URLs.
[documentation] → [urls] → internet homepage URL of the cylc internet homepage, with links to documentation for the latest official release.
[documentation] → [urls] → local index Local intranet URL of the main cylc documentation index.
PDF and HTML viewers can be launched by cylc to view the documentation.
Your preferred PDF viewer program.
Your preferred web browser.
Choose your favourite text editor for editing suite definitions.
The editor to be invoked by the cylc command line interface.
The editor to be invoked by the cylc GUI.
Pyro is the RPC layer used for network communication between cylc clients (suite-connecting commands and GUIs) and servers (running suites). Each suite listens on a dedicated network port, binding on the first available starting at the configured base port.
The first port that cylc is allowed to use.
This determines the maximum number of suites that can run at once on the suite host.
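The port selection rule can be sketched in Python as a scan upward from the base port. This is an illustration under stated assumptions rather than cylc's code: the function names and the injectable is_free check are hypothetical, and in practice the binding is done through Pyro.

```python
import socket


def port_is_free(port, host="localhost"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def first_available_port(base_port, max_ports, is_free=port_is_free):
    """Return the first free port in [base_port, base_port + max_ports)."""
    for port in range(base_port, base_port + max_ports):
        if is_free(port):
            return port
    # Every port in the range is in use, so no further suite can start.
    raise RuntimeError("no free port in the configured range")
```

Since each running suite occupies one port, the size of the range is also the limit on concurrently running suites on the host.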
Each suite stores its port number, by suite name, under this directory.
The [hosts] section configures some important host-specific settings for the suite host (‘localhost’) and remote task hosts. Note that remote task behaviour is determined by the site/user config on the suite host, not on the task host. Suites can specify task hosts that are not listed here, in which case local settings will be assumed, with the local home directory path, if present, replaced by $HOME in items that configure directory locations.
The default task host is the suite host, localhost, with default values as listed below. Use an explicit [hosts][[localhost]] section if you need to override the defaults. Localhost settings are then also used as defaults for other hosts, with the local home directory path replaced as described above. This applies to items omitted from an explicit host section, and to hosts that are not listed at all in the site and user config files. Explicit host sections are only needed if the automatically modified local defaults are not sufficient.
Host section headings can also be regular expressions to match multiple hostnames. Note that the general regular expression wildcard is ‘.⋆’ (zero or more of any character), not ‘⋆’. Hostname matching regular expressions are used as-is in the Python re.match() function. As such they match from the beginning of the hostname string (as specified in the suite definition) and they do not have to match through to the end of the string (use the string-end matching character ‘$’ in the expression to force this).
A hierarchy of host match expressions from specific to general can be used because config items are processed in the order specified in the file.
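The matching behaviour can be tried out directly with Python's re.match(). In the sketch below the function name, the host names, and the settings are hypothetical, and the first matching section in file order is assumed to be the one that applies.

```python
import re


def match_host_section(hostname, sections):
    """Return the settings of the first section whose pattern matches.

    sections: list of (pattern, settings) pairs in config file order.
    re.match anchors at the start of the hostname but not at the end.
    """
    for pattern, settings in sections:
        if re.match(pattern, hostname):
            return settings
    return None  # no explicit section: local defaults would apply


# Specific expressions first, general ones last (all names are made up).
sections = [
    ("hpc1$", {"work directory": "/scratch1"}),
    ("hpc.*", {"work directory": "/scratch"}),
]
print(match_host_section("hpc1", sections))   # matched by "hpc1$"
print(match_host_section("hpc12", sections))  # "$" rejects it; "hpc.*" matches
```

Without the trailing "$", the pattern "hpc1" would also match "hpc12", because only the beginning of the string has to match.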
[hosts] → HOST → run directory The top level of the directory tree that holds suite-specific output logs, state dump files, run database, etc.
[hosts] → HOST → work directory The top level for suite work and share directories.
[hosts] → HOST → task communication method The means by which task progress messages are reported back to the running suite. See above for default polling intervals for the poll method.
[hosts] → HOST → remote shell template A string template, containing %s as a placeholder for the host name, for the command used to invoke commands on this host. This is not used on the suite host unless you run local tasks under another user account.
[hosts] → HOST → use login shell Whether to use a login shell or not for remote command invocation. By default cylc runs remote ssh commands using a login shell,
which will source /etc/profile and ~/.profile to set up the user environment. However, for security reasons some institutions do not allow unattended commands to start login shells, so you can turn off this behaviour to get,
which will use the default shell on the remote machine, sourcing ~/.bashrc (or ~/.cshrc) to set up the environment. In either case $PATH on the remote machine should include $CYLC_DIR/bin in order for the remote cylc program to be found.
NOTE: this setting does not currently apply to job submission commands (which execute on the suite host to submit remote tasks).
The suite host’s identity must be determined locally by cylc and passed to running tasks (via $CYLC_SUITE_HOST) so that task messages can target the right suite on the right host.
This item determines how cylc finds the identity of the suite host. For the default name method cylc asks the suite host for its host name. This should resolve on remote task hosts to the IP address of the suite host; if it doesn’t, adjust network settings or use one of the other methods. For the address method, cylc attempts to use a special external “target address” to determine the IP address of the suite host as seen by remote task hosts (in-source documentation in $CYLC_DIR/lib/cylc/suite_host.py explains how this works). And finally, as a last resort, you can choose the hardwired method and manually specify the host name or IP address of the suite host.
This item is required for the address self-identification method. If your suite host sees the internet, a common address such as google.com will do; otherwise choose a host visible on your intranet.
Use this item to explicitly set the name or IP address of the suite host if you have to use the hardwired self-identification method.
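The three self-identification methods can be sketched in Python as follows. This is only an illustration: the function name and arguments are hypothetical, and the address branch uses a common UDP socket technique that is assumed here rather than taken from suite_host.py.

```python
import socket


def suite_host_identity(method="name", target=None, hardwired=None):
    """Return a host name or IP address for the suite host.

    name:      ask the local host for its own name.
    address:   find the local IP address used to reach an external target.
    hardwired: return a manually configured name or address.
    """
    if method == "name":
        return socket.gethostname()
    if method == "address":
        if not target:
            raise ValueError("the address method needs a target address")
        # Connecting a UDP socket sends no data but selects the outgoing
        # interface; the port number here is arbitrary.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.connect((target, 80))
            return sock.getsockname()[0]
    if method == "hardwired":
        if not hardwired:
            raise ValueError("the hardwired method needs a host name or address")
        return hardwired
    raise ValueError("unknown self-identification method: %s" % method)
```

Whatever the method, the value returned must be one that remote task hosts can resolve back to the suite host.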
Utilities such as cylc gsummary need to scan hosts for running suites.
A list of hosts to scan for running suites.
Each cylc user can optionally run his/her own lockserver to prevent accidental invocation of multiple instances of the same suite or task at the same time. The suite and task locks brokered by the lockserver are analogous to traditional lock files, but they work across a network, even for distributed suites containing tasks that start executing on one host and finish on another.
Accidental invocation of multiple instances of the same suite or task at the same time can have serious consequences, so use of the lockserver should be considered for important operational suites. For general, less critical usage it may be an unnecessary complication, so it is currently disabled by default.
To enable the lockserver:
The suite will now abort at start-up if it cannot connect to the lockserver. To start your lockserver daemon,
To check that it is running,
For detailed usage information,
There is a command line client interface,
for interrogating the lockserver and managing locks manually (e.g. releasing locks if a suite was killed before it could clean up after itself).
To watch suite locks being acquired and released as a suite runs,
The graph view in the gcylc GUI has the advantage that it shows the structure of a suite very clearly as it evolves. It works remarkably well even for very large suites (up to several hundred tasks or more) but because the graphviz engine does a new global layout every time the graph changes the layout is often not very stable. This may not be a solvable problem even in principle as it seems likely that making continual incremental changes to an existing graph without redoing the global layout would inevitably result in a horrible mess.
The following features of the graph view, however, help mitigate the jumping layout problem:
Pyro (Python Remote Objects) is a widely used open source object oriented Remote Procedure Call technology developed by Irmen de Jong.
Earlier versions of cylc used the Pyro Nameserver to marshal communication between client programs (tasks, commands, viewers, etc.) and their target suites. This worked well, but in principle it provided a route for one suite or user on the subnet to bring down all running suites by killing the nameserver. Consequently cylc now uses Pyro simply as a lightweight object oriented wrapper for direct network socket communication between client programs and their target suites - all suites are thus entirely isolated from one another.
Copyright Ⓒ 2007 Free Software Foundation, Inc. http://fsf.org/
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for software and other kinds of works.
The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program–to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it.
For the developers’ and authors’ protection, the GPL clearly explains that there is no warranty for this free software. For both users’ and authors’ sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions.
Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users’ freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and modification follow.
Terms and Conditions
“This License” refers to version 3 of the GNU General Public License.
“Copyright” also means copyright-like laws that apply to other kinds of works, such as semiconductor masks.
“The Program” refers to any copyrightable work licensed under this License. Each licensee is addressed as “you”. “Licensees” and “recipients” may be individuals or organizations.
To “modify” a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a “modified version” of the earlier work or a work “based on” the earlier work.
A “covered work” means either the unmodified Program or a work based on the Program.
To “propagate” a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.
To “convey” a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays “Appropriate Legal Notices” to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion.
The “source code” for a work means the preferred form of the work for making modifications to it. “Object code” means any non-source form of a work.
A “Standard Interface” means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language.
The “System Libraries” of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A “Major Component”, in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it.
The “Corresponding Source” for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work’s System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work.
The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source.
The Corresponding Source for a work in source code form is that same work.
All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary.
No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures.
When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work’s users, your or third parties’ legal rights to forbid circumvention of technological measures.
You may convey verbatim copies of the Program’s source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee.
You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions:
A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an “aggregate” if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation’s users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.
You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:
A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work.
A “User Product” is either (1) a “consumer product”, which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, “normally used” refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product.
“Installation Information” for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.
If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM).
The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying.
“Additional permissions” are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms:
All other non-permissive additional terms are considered “further restrictions” within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way.
You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11).
However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.
Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10.
You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so.
Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License.
An “entity transaction” is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party’s predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it.
A “contributor” is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor’s “contributor version”.
A contributor’s “essential patent claims” are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, “control” includes the right to grant patent sublicenses in a manner consistent with the requirements of this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor’s essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version.
In the following three paragraphs, a “patent license” is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To “grant” such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party.
If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. “Knowingly relying” means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient’s use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it.
A patent license is “discriminatory” if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law.
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program.
Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such.
The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License “or any later version” applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation.
If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy’s public statement of acceptance of a version permanently authorizes you to choose that version for the Program.
Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.
End of Terms and Conditions
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the “copyright” line and a pointer to where the full notice is found.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode:
The hypothetical commands “show w” and “show c” should show the appropriate parts of the General Public License. Of course, your program’s commands might be different; for a GUI interface, you would use an “about box”.
You should also get your employer (if you work as a programmer) or school, if any, to sign a “copyright disclaimer” for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see http://www.gnu.org/licenses/.
The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read http://www.gnu.org/philosophy/why-not-lgpl.html.
1. Future plans for EcoConnect include additional deterministic regional weather forecasts and a statistical ensemble.
2. An OR operator on the right doesn’t make much sense: if “B or C” triggers off A, what exactly should cylc do when A finishes?
3. In NWP forecast analysis suites, parts of the observation processing and data assimilation subsystem will typically also depend on model background fields generated by the previous forecast.
4. A warm cycling model that only writes out one set of restart files, for the very next cycle, does not need to be declared sequential because this early triggering problem cannot arise.
5. Note that $CYLC_SUITE_ENVIRONMENT is a string containing embedded newline characters, and it has to be handled accordingly. In the bash shell, for instance, it should be echoed in quotes to avoid concatenation to a single line.
6. If you accidentally delete a port file while a suite is running, use cylc scan to determine the port number, then either supply it on the command line (--port) or rewrite the port file manually.
7. The cylc submit command runs a single task exactly as its suite would, in terms of both job submission method and execution environment.
8. If you copy a suite using cylc commands or the GUI, the entire suite definition directory will be copied.
9. Spawning any earlier than this brings no advantage in terms of functional parallelism and would cause uncontrolled proliferation of waiting tasks.
10. This is because you don’t want Model[T] waiting around to trigger off Model[T-12] if Model[T-6] has not finished yet. If Model is forced to be sequential this can’t happen because Model[T] won’t exist in the suite until Model[T-6] has finished. But if Model[T-6] fails, it can be spawned-and-removed from the suite so that Model[T] can then trigger off Model[T-12], which is the correct behaviour.
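The quoting issue described in the footnote on $CYLC_SUITE_ENVIRONMENT can be illustrated with a minimal shell sketch. The variable suite_env below is a hypothetical stand-in, and its contents are invented for illustration; the real value is set by cylc and will differ. The sketch only shows how unquoted expansion collapses embedded newlines while quoted expansion preserves them.

```shell
# Hypothetical stand-in for $CYLC_SUITE_ENVIRONMENT: a string with an
# embedded newline. The names and values here are invented for illustration.
suite_env='export FOO=1
export BAR=2'

# Unquoted expansion: the shell splits the value on whitespace (including
# newlines) and echo rejoins the words with single spaces, giving one line.
unquoted=$(echo $suite_env)

# Quoted expansion: the embedded newline is passed through untouched.
quoted=$(echo "$suite_env")

# Count the lines in each result (grep -c '' counts every line).
unquoted_lines=$(printf '%s\n' "$unquoted" | grep -c '')
quoted_lines=$(printf '%s\n' "$quoted" | grep -c '')

echo "unquoted: $unquoted_lines line(s); quoted: $quoted_lines line(s)"
```

The same applies when the value is passed on to another command or written to a file: quote the expansion wherever the line structure matters.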